Patent application title:

Cell Cycle Genes and Related Methods

Publication number:

US20100122382A1

Publication date:
Application number:

12/555,853

Filed date:

2009-09-09

Abstract:

Novel plant polysaccharide synthesis genes and polypeptides encoded by such genes are provided. These genes and polynucleotide sequences are useful for regulating polysaccharide synthesis and plant phenotype. Moreover, these genes are useful for expression profiling of plant polysaccharide synthesis genes. The invention specifically provides cell cycle polynucleotide and polypeptide sequences isolated from Eucalyptus and Pinus.

Inventors:

Classification:

C07K14/4738 »  CPC further

Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used Cell cycle regulated proteins, e.g. cyclin, CDC, INK-CCR

C12N15/8261 »  CPC further

Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor; Recombinant DNA-technology; Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression; Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs); Phenotypically and genetically modified plants via recombinant DNA technology with agronomic (input) traits, e.g. crop yield

Y02A40/146 »  CPC further

Adaptation technologies in agriculture, forestry, livestock or agroalimentary production in agriculture Genetically Modified [GMO] plants, e.g. transgenic plants

A01H5/00 IPC

Products

A01H5/00 IPC

Angiosperms, i.e. flowering plants, characterised by their plant parts; Angiosperms characterised otherwise than by their botanic taxonomy

C07H21/04 IPC

Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids with deoxyribosyl as saccharide radical

C12N15/63 IPC

Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor; Recombinant DNA-technology Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression

C12N5/10 IPC

Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor Cells modified by introduction of foreign genetic material

C07K14/415 »  CPC main

Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants

C12Q1/68 IPC

Measuring or testing processes involving enzymes, nucleic acids or microorganisms ; Compositions therefor; Processes of preparing such compositions involving nucleic acids

C40B40/06 IPC

Libraries , e.g. arrays, mixtures; Libraries containing only organic compounds Libraries containing nucleotides or polynucleotides, or derivatives thereof

C40B30/04 IPC

Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding

Description

CROSS REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. provisional application Ser. No. 60/533,036, filed on Dec. 30, 2003, which is specifically incorporated in its entirety herein by reference.

FIELD OF THE INVENTION

The present invention relates generally to the field of plant cell cycle genes and polypeptides encoded by such genes, and the use of such polynucleotide and polypeptide sequences for regulating a plant cell cycle. The invention specifically provides cell cycle polynucleotide and polypeptide sequences isolated from Eucalyptus and Pinus and sequences related thereto.

BACKGROUND OF THE INVENTION

Cell growth and division are controlled by the temporal expression of different sets of genes, allowing the dividing cell to progress through the different phases of the cell cycle. Continued growth and organogenesis in plants require precise function of the cell cycle machinery. Plant development, which is directly affected by cell division rates and patterns, is also influenced by environmental factors such as temperature, nutrient availability, and light. See Gastal and Nelson, Plant Physiol. 105:191-7 (1994), Ben-Haj-Salah and Tardieu, Plant Physiol. 109:861-7 (1995), and Sacks et al., Plant Physiol. 114:519-27 (1997). Plant development and phenotype are connected with the cell cycle, and altering expression of the genes involved in the cell cycle can be a useful method of modifying plant development and altering plant phenotype.

The ability to alter expression of cell cycle genes is extremely powerful because the cell cycle drives plant development, including growth rates, responses to environmental cues, and resulting plant phenotype. Control of the plant cell cycle and phenotypes associated with alteration of cell cycle gene expression, in the vascular cambium, in particular, has applications for, inter alia, alteration of wood properties and, in particular, lumber and wood pulp properties. For example, improvements to wood pulp that can be effected by altering cell cycle gene expression include increased or decreased lignin and cellulose content, and altered length, diameter, and lumen diameter of cells. Manipulating the plant cell cycle, and in particular the cambium cell cycle (i.e. the rate and angle of cell division), can also engineer better lumber having increased dimensional stability, increased tensile strength, increased shear strength, increased compression strength, increased shock resistance, increased stiffness, increased or decreased hardness, decreased spirality, decreased shrinkage, and desirable characteristics with respect to weight, density, and specific gravity.

A. Cell Cycle Genes and Proteins

1. Cyclin Dependent Protein Kinase

Progression through the cell cycle is regulated primarily by cyclin-dependent kinases (CDKs). CDKs are a conserved family of eukaryotic serine/threonine protein kinases, which require heterodimer formation with a cyclin subunit for activity. For review see, e.g. Joubes et al., Plant Mol. Biol. 43: 607-20 (2000), Stals and Inze, Trends Plant Sci. 6:359-64 (2001), and John et al., Protoplasma 216: 119-42 (2001).

There are five subclasses of CDKs, each having a different cyclin binding consensus sequence. In CDK type A the cyclin binding consensus sequence is PSTAIRE. Id. The cyclin binding consensus sequences in CDK types B-1, B-2, and C are PPTTLRE, PPTALRE, and PITAIRE, respectively. Joubes et al., Plant Physiol. 126: 1403-15 (2001).
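As an illustration only, the motif-based distinction described above can be sketched as a simple lookup. The function name and the toy sequences are hypothetical and do not come from the application; only the four motifs named in the text are included, so a sequence from any other subclass is reported as unclassified.

```python
# Cyclin-binding motifs exactly as listed in the text; labels follow the
# text's naming (A, B-1, B-2, C). No other subclass motif is assumed.
CDK_MOTIFS = {
    "PSTAIRE": "A",
    "PPTTLRE": "B-1",
    "PPTALRE": "B-2",
    "PITAIRE": "C",
}


def classify_cdk(protein_seq):
    """Return (cdk_type, motif, 0-based position) for the earliest listed
    motif found in the sequence, or None if none of the motifs occurs."""
    seq = protein_seq.upper()
    hits = []
    for motif, cdk_type in CDK_MOTIFS.items():
        pos = seq.find(motif)
        if pos != -1:
            hits.append((pos, cdk_type, motif))
    if not hits:
        return None
    pos, cdk_type, motif = min(hits)  # earliest occurrence wins
    return cdk_type, motif, pos


if __name__ == "__main__":
    # Hypothetical toy fragments, not sequences from the application.
    for fragment in ("AAAPSTAIREAAA", "xxppttlrexx", "ACDEFG"):
        print(fragment, classify_cdk(fragment))
```

An exact string match is the simplest possible reading of "consensus sequence"; a real annotation pipeline would more likely use profile-based domain searches, which this sketch does not attempt.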

Cell cycle progression is directed, in part, by changes in CDK activity. CDK activity is modulated by a number of different cell cycle protein components, such as changes in the abundance of individual cyclins due to changing rates of biosynthesis and proteolysis. Fluctuations in cyclin concentrations result in commensurate fluctuations in CDK activity. Cyclin accumulation is especially important in terminating the G1 phase of the cell cycle because DNA replication is initiated by an increase in CDK activity.

Activation of CDK also requires phosphorylation of a threonine residue within the T-loop of CDK by a CDK-activating kinase (CAK). Umeda et al., Proc. Nat'l Acad. Sci. U.S.A. 97: 13396-400 (2000). It was suggested by Yamaguchi et al., Plant J. 24: 11-20 (2000), that cyclin H is a regulatory subunit of CAK. CDK activity is further regulated by interaction with a CDK regulatory subunit, a small (70-100 AA) protein involved in cell cycle regulation.

A cell must exit the cell cycle in order to commit to differentiation, senescence or apoptosis. This process involves the down-regulation of CDK activities. CDK inhibitors (CKI) are low molecular weight proteins, which are important for cell cycle regulation and development. CKIs bind stoichiometrically to CDK and down-regulate the activity of CDKs.

Many biochemical properties of ICK1, the first plant CKI to be identified from Arabidopsis thaliana, are known. Wang et al., Nature 386:451-2 (1997); Wang et al., Plant J. 24: 613-23 (2000). ICK1 is expressed at low levels in many tissue types, and there can be a threshold level of ICK1 that must be overcome before a cell can enter the cell cycle. Wang et al., Plant J. 24: 613-23 (2000). ICK1 is induced by the plant growth regulator abscisic acid (ABA), which inhibits cell division by blocking DNA replication. When the expression of ICK1 increases, there is a corresponding decrease in Cdc2-like histone H1 kinase activity. ICK1 has been shown to bind in vitro with Cdc2a and the cyclin CycD3, and deletion experiments have identified different domain regions for these two interactions.

Altering the expression of CDK regulatory protein or a subunit thereof is known to cause changes in plant phenotype. Overexpression of the Arabidopsis CDK regulatory subunit, CKS1At, resulted in a reduction of leaf size, root growth rates and meristem size. Additionally, overexpression of CKS1At resulted in inhibition of cell-cycle progression, with an extension in the duration of the G1 and G2 phases of the cell cycle.

2. Cyclins

Cyclins are positive regulatory subunits of cyclin-dependent kinase (CDK) enzymes and are required for CDK activity. Fowler et al., Mol. Biotech. 10, 123, 126. Cyclins and CDK complexes provide temporal regulation of transition through the cell cycle. Evidence also suggests that cyclins provide spatial regulation of specific CDK activity, differentially targeting the cytoskeleton, spindle, phragmoplast, nuclear envelope, and chromosomes.

Plant cyclins are classified into five major groups: A, B, C, D, and H. Renaudin et al., Plant Mol. Biol. 32: 1003-18 (1996) and Yamaguchi et al., (supra 2000). Cyclins can be divided into mitotic cyclins (A and B) and G1 cyclins.

The mitotic cyclins possess a consensus sequence (R-x-x-L-x-x-I-x-N) located at the N-terminal region, termed a destruction box, adjacent to a lysine-rich region. The destruction box and lysine-rich region target the mitotic cyclins for ubiquitin-dependent proteolysis during mitosis. Stals, supra at 361, and Fowler, supra at 126. The destruction box in A versus B cyclins differs slightly, and this difference is thought to result in slightly different timing of degradation of A versus B cyclins. Fowler, supra at 126. A-type cyclins accumulate during the S, G2, and early M phase of the cell cycle, whereas B-type cyclins accumulate during the late G2 and early M phase. Mironov et al., Plant Cell 11: 509-22 (1999). Three subgroups of A-type cyclins are known in plants, but only one is known in animals. Cyclin A1 (cycA1;zm;1 from Zea mays) is most concentrated during cytokinesis at the microtubule-containing phragmoplast. Expression of cyclin A2 is upregulated by auxins in roots, and by cytokinins in the shoot apex. Abrahams et al., Biochim. Biophys. Acta 28: 1-2 (2001).
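The destruction box consensus given above (R-x-x-L-x-x-I-x-N, with x standing for any residue) maps directly onto a regular expression. The following is a minimal sketch under stated assumptions: the function name, the 100-residue N-terminal window, and the toy sequence are all hypothetical, and the adjacent lysine-rich region is not checked.

```python
import re

# R-x-x-L-x-x-I-x-N from the text; "." stands for any single residue.
# The lookahead lets overlapping candidate boxes be reported as well.
DESTRUCTION_BOX = re.compile(r"(?=(R..L..I.N))")


def find_destruction_boxes(protein_seq, n_terminal_window=100):
    """Return (0-based start, matched residues) for each consensus match
    lying entirely within the first n_terminal_window residues.

    The window size is an illustrative assumption, not a value from the text.
    """
    region = protein_seq.upper()[:n_terminal_window]
    return [(m.start(), m.group(1)) for m in DESTRUCTION_BOX.finditer(region)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the application.
    print(find_destruction_boxes("MAARAALGDIGNKK"))
```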

D-type cyclins, of which five subgroups are known, are thought to control the progression through the G1 phase in response to growth factors and nutrients. Riou-Khamlichi et al., Mol. Cell Biol. 20: 4513-21 (2000). For example, the expression of D-type cyclins is upregulated by sucrose as shown by an increase in cycD2 mRNA 30 minutes after sucrose exposure, and an increase in cycD3 four hours after sucrose exposure. This timing corresponds to early G1-phase and late G1-phase, respectively. Cockcroft et al., Nature 405: 575-9 (2000). Furthermore, in Arabidopsis, a D3 cyclin was shown to be upregulated by the brassinosteroid, epi-brassinolide.

Cyclin D2 proteins bind with CDKA to produce an active complex, which binds to and phosphorylates retinoblastoma-related protein (Rb). This process is found in actively proliferating tissue, suggesting it plays an important function during late G1- and early S-phase. Three different D3-type cyclins are active during tomato fruit development. These proteins all contain a retinoblastoma binding motif and a PEST-destruction motif. There are differences in the spatial and temporal expression of these D3 cyclins, suggesting different roles during fruit development.

Overexpression of cyclin D was shown to increase overall growth rate. Overexpression of cyclin D2 in tobacco shortens the G1 phase, which produces a faster rate of cell cycling.

C- and H-type cyclins were characterized in poplar (Populus tremula×tremuloides) and rice (Oryza sativa) but their exact function is still unclear. Putative cyclins with a lesser degree of peptide sequence conservation have also been identified. For example, Arabidopsis CycJ18 has only 20% identity with homologues over the cyclin box domain. CycJ18 is expressed predominantly in young seedlings. Arabidopsis F3O9.13 protein also has similarity to the cyclin family.

3. Histone Acetyltransferase/Deacetylase

Histone acetyltransferase (HA) and histone deacetylase (HDA) control the net level of acetylation of histones. Histone acetylation and deacetylation are thought to exert their regulatory effects on gene expression by altering the accessibility of nucleosomal DNA to DNA-binding transcriptional activators, other chromatin-modifying enzymes or multi-subunit chromatin remodeling complexes capable of displacing nucleosomes. Lusser et al., Nucleic Acids Res. 27: 4427-35 (1999). Therefore, in general, the HDAs are involved in the repression of gene expression, while HAs are correlated with gene activation.

HA effects acetylation at the ε-amino group of conserved lysine residues clustered near the amino terminus of core histones which up-regulates gene expression.

HDAs remove acetyl groups from the core histones of the nucleosome. There are numerous family members in the HDA group, many of which are conserved throughout evolution. Lechner et al., Biochim Biophys Acta 5:181-8 (1996). HDAs function as part of multi-protein complexes facilitating chromatin condensation.

HDAs and HAs recognize highly distinct acetylation patterns on the nucleosome. It is thought that different types of HDAs interact with specific regions of the genome, to influence gene silencing.

Schultz et al., Genes Dev. 15: 428-43 (2001), demonstrated that the superfamily of Kruppel-associated-box zinc finger proteins (KRAB-ZFPs) are linked to the nucleosome remodeling and histone deacetylation complex via the PHD (plant homeodomain) and bromodomains of co-repressor KAP-1, to form a cooperative unit that is required for transcriptional repression. A maize HDAC (HD2) has been identified that has no sequence homology to other eukaryotic HDACs, but instead contains sequence similarity to peptidyl-prolyl cis-trans isomerases (PPIases).

The effects of interfering with histone deacetylation are discussed in e.g. Tian and Chen, Proc. Nat'l Acad. Sci. USA 98: 200-5 (2001).

4. Peptidyl Prolyl Cis-Trans Isomerase

Peptidylprolyl isomerases (e.g., peptidylprolyl cis-trans isomerase, peptidyl-prolyl cis-trans isomerase, PPIase, rotamase, cyclophilin) catalyze the interconversion of peptide bonds between the cis and trans conformations at proline residues. Sheldon and Venis, Biochem J. 315: 965-70 (1996). This interconversion is thought to be the rate limiting step of protein folding. PPIases belong to a conserved family of proteins that are present in animals, fungi, bacteria and plants. PPIases are implicated in a number of responses including the response to environmental stress, calcium signals, transcriptional repression, cell cycle control, etc. Viaud, et al., Plant Cell 14: 917-30 (2002).

5. Retinoblastoma-Related Protein

Retinoblastoma (Rb)-related protein putatively regulates progression of the cell cycle through the G1 phase and into S phase. Xie et al., EMBO J. 15: 4900-8 (1996) and Ach et al., Mol. Cell Biol. 17: 5077-86 (1997).

Although Rb is well-characterized in mammalian systems, the role of Rb-related proteins in regulation of G1 phase progression and S phase entry is not well characterized in plants. It is known, however, that Rb-related protein functions through its association with various other cellular proteins involved in cell cycle regulation, such as the cyclins and WD40 proteins. Soni et al., Plant Cell 7:85-103 (1995); Grafi et al., Proc. Natl. Acad. Sci. U.S.A. 93:8962 (1996); Ach et al., Plant Cell 9:1595-606 (1997); Umen and Goodenough, Genes Dev. 15:1652-61 (2001); Mariconti et al., J. Biol. Chem. 277:9911-9 (2002).

6. WD40 Repeat Protein

WD40 is a common repeating motif involved in many different protein-protein interactions. The WD40 domain is found in proteins having a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly. Goh et al., Eur. J. Biochem. 267: 434-49 (2000).

The WD40 domain, which is 40 residues long, typically contains a GH dipeptide 11-24 residues from the N-terminus and the WD dipeptide at the C-terminus. Id. Between the GH dipeptide and the WD dipeptide lies a conserved core which serves as a stable platform where proteins can bind either stably or reversibly. The core forms a propeller-like structure with several blades. Each blade is composed of a four-stranded anti-parallel β-sheet. Each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade. The last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure. The residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands.
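The positional description above (a unit of about 40 residues, a GH dipeptide 11-24 residues from the N-terminus, and a WD dipeptide at the C-terminus) can be turned into a crude window test. This is a rough sketch only: the function names are hypothetical, the fixed window length of 40 and the reading of "11-24 residues from the N-terminus" as a 0-based offset are assumptions, and real repeats vary enough that this check would miss or mislabel many of them.

```python
def is_wd40_like(window):
    """True if the window ends with WD and contains a GH dipeptide that
    starts at a 0-based offset of 11 to 24 from the window start."""
    w = window.upper()
    if not w.endswith("WD"):
        return False
    return any(w[i:i + 2] == "GH" for i in range(11, 25))


def scan_wd40(protein_seq, length=40):
    """Return 0-based start positions of fixed-length windows that pass
    is_wd40_like. The default length follows the text's 40 residues."""
    seq = protein_seq.upper()
    return [
        start
        for start in range(len(seq) - length + 1)
        if is_wd40_like(seq[start:start + length])
    ]


if __name__ == "__main__":
    # Hypothetical toy repeat: 15 filler residues, GH, 21 filler residues, WD.
    toy_repeat = "A" * 15 + "GH" + "A" * 21 + "WD"
    print(scan_wd40("MM" + toy_repeat + "KK"))
```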

Studies in yeast demonstrated that Cdc20, which contains the WD40 motif, is required for the proteolysis of mitotic cyclins. This process is mediated by a ubiquitin-protein ligase called anaphase-promoting complex (APC) or cyclosome. Following ubiquitination and proteolysis by the 26S proteasome, the cell can segregate chromosomes and exit from mitosis. Cdc20 also contains a destruction-box domain.

7. WEE1-Like Protein

WEE1 controls the activity of cyclin-dependent kinases. WEE1 itself is a serine/threonine kinase. Sorrell et al., Planta 215: 518-22 (2002). The enzymatic activity of these protein kinases is controlled by phosphorylation of specific residues in the activation segment of the catalytic domain, sometimes combined with reversible conformational changes in the C-terminal autoregulatory tail. This process is conserved among eukaryotes, from fungi to animals and plants. Similarly, there is a high degree of homology between WEE1 proteins from various organisms. For example, there is 50% identity between the protein kinase domains of the human and maize WEE1 proteins.

Expression of WEE1 is shown to occur only in actively dividing tissues and is believed to inhibit cell division by acting as a negative regulator of mitosis. WEE1 is believed to prevent entry from G2 to M by protecting the nucleus from cytoplasmically-activated cyclin B1-complexed CDC2 before the onset of mitosis. For example, over-expression of AtWEE1 (from Arabidopsis) and ZmWEE1 (from Zea mays) in fission yeast inhibits cell division, which results in elongated cells. Sun et al., Proc. Nat'l Acad. Sci. USA 96: 4180-5 (1999).

B. Expression Profiling and Microarray Analysis in Plant Development

The multigenic control of plant phenotype presents difficulties in determining the genes responsible for phenotypic determination. One major obstacle to identifying genes and gene expression differences that contribute to phenotype in plants is the difficulty with which the expression of more than a handful of genes can be studied concurrently. Another difficulty in identifying and understanding gene expression and the interrelationship of the genes that contribute to plant phenotype is the high degree of sensitivity to environmental factors that plants demonstrate.

There have been recent advances using genome-wide expression profiling. In particular, the use of DNA microarrays has been useful to examine the expression of a large number of genes in a single experiment. Several studies of plant gene responses to developmental and environmental stimuli have been conducted using expression profiling. For example, microarray analysis was employed to study gene expression during fruit ripening in strawberry, Aharoni et al., Plant Physiol. 129:1019-1031 (2002), wound response in Arabidopsis, Cheong et al., Plant Physiol. 129:661-7 (2002), pathogen response in Arabidopsis, Schenk et al., Proc. Nat'l Acad. Sci. 97:11655-60 (2000), and auxin response in soybean, Thibaud-Nissen et al., Plant Physiol. 132:118. Whetten et al., Plant Mol. Biol. 47:275-91 (2001) discloses expression profiling of cell wall biosynthetic genes in Pinus taeda L. using cDNA probes. Whetten et al. examined genes which were differentially expressed between differentiating juvenile and mature secondary xylem. Additionally, to determine the effect of certain environmental stimuli on gene expression, gene expression in compression wood was compared to normal wood. 156 of the 2300 elements examined showed differential expression. Whetten, supra at 285. Comparison of juvenile wood to mature wood showed 188 elements as differentially expressed. Id. at 286.

Although expression profiling and, in particular, DNA microarrays provide a convenient tool for genome-wide expression analysis, their use has been limited to organisms for which the complete genome sequence or a large cDNA collection is available. See Hertzberg et al., Proc. Nat'l Acad. Sci. 98:14732-7 (2001a), Hertzberg et al., Plant J., 25:585 (2001b). For example, Whetten, supra, states, “A more complete analysis of this interesting question awaits the completion of a larger set of both pine and poplar ESTs.” Whetten et al. at 286. Furthermore, microarrays comprising cDNA or EST probes may not be able to distinguish genes of the same family because of sequence similarities among the genes. That is, cDNAs or ESTs, when used as microarray probes, may bind to more than one gene of the same family.

Methods of manipulating gene expression to yield a plant with a more desirable phenotype would be facilitated by a better understanding of cell cycle gene expression in various types of plant tissue, at different stages of plant development, and upon stimulation by different environmental cues. The ability to control plant architecture and agronomically important traits would be improved by a better understanding of how cell cycle gene expression affects formation of plant tissues, how cell cycle gene expression causes plant cells to enter or exit cell division, and how plant growth and the cell cycle are connected. Among the large number of genes, the expression of which can change during development of a plant, only a fraction are likely to effect phenotypic changes during any given stage of the plant development.

SUMMARY

Accordingly, there is a need for tools and methods useful in determining the changes in the expression of cell cycle genes that occur during the plant cell cycle. There is also a need for polynucleotides useful in such methods. There is a further need for methods which can correlate changes in cell cycle gene expression to phenotype or stage of plant development. There is a further need for methods of identifying cell cycle genes and gene products that impact plant phenotype, and that can be manipulated to obtain a desired phenotype.

In one aspect, the present invention provides an isolated polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof.

In another aspect, the present invention provides a DNA construct comprising at least one polynucleotide having the sequence of any one of SEQ ID NOs: 1-237 and conservative variants thereof.

Another aspect of the invention is a plant cell transformed with a DNA construct comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof.

A further aspect of the invention is a transgenic plant comprising a plant cell transformed with a DNA construct comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof.

Another aspect of the invention is an isolated polynucleotide comprising a sequence encoding the catalytic or substrate-binding domain of a polypeptide selected from any one of SEQ ID NOs: 261-497, wherein the polynucleotide encodes a polypeptide having the activity of said polypeptide selected from any one of SEQ ID NOs: 261-497.

A further aspect of the invention is a method of making a transformed plant comprising transforming a plant cell with a DNA construct comprising at least one polynucleotide having the sequence of any of SEQ ID NOs: 1-237; and culturing the transformed plant cell under conditions that promote growth of a plant.

In another aspect, the invention provides a wood obtained from a transgenic tree.

In a further aspect, the invention provides a wood pulp obtained from a transgenic tree which has been transformed with the DNA construct of the invention.

Another aspect of the invention is a method of making wood, comprising transforming a plant with a DNA construct comprising a polynucleotide having a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof; culturing the transformed plant under conditions that promote growth of a plant; and obtaining wood from the plant.

The invention further provides a method of making wood pulp, comprising transforming a plant with a DNA construct comprising a polynucleotide having a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof; culturing the transformed plant under conditions that promote growth of a plant; and obtaining wood pulp from the plant.

In another aspect, the invention provides an isolated polypeptide comprising an amino acid sequence encoded by the isolated polynucleotide comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof.

The invention also provides an isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 261-497.

The invention further provides a method of altering a phenotype of a plant, comprising altering expression in the plant of a polypeptide encoded by any one of SEQ ID NOs: 1-237.

In another aspect, the invention provides a polynucleotide comprising a nucleic acid selected from the group consisting of SEQ ID NOs: 471-697.

An aspect of the invention is a method of correlating gene expression in two different samples, comprising detecting a level of expression of one or more genes encoding a product encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof in a first sample; detecting a level of expression of the one or more genes in a second sample; comparing the level of expression of the one or more genes in the first sample to the level of expression of the one or more genes in the second sample; and correlating a difference in expression level of the one or more genes between the first and second samples.

A further aspect of the invention is a method of correlating the possession of a plant phenotype to the level of gene expression in the plant of one or more genes comprising detecting a level of expression of one or more genes encoding a product encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof in a first plant possessing a phenotype; detecting a level of expression of the one or more genes in a second plant lacking the phenotype; comparing the level of expression of the one or more genes in the first plant to the level of expression of the one or more genes in the second plant; and correlating a difference in expression level of the one or more genes between the first and second plants to possession of the phenotype.

In a further aspect, the invention provides a method of correlating gene expression to a stage of the cell cycle, comprising detecting a level of expression of one or more genes encoding a product encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 and conservative variants thereof in a first plant cell in a first stage of the cell cycle; detecting a level of expression of the one or more genes in a second plant cell in a second, different stage of the cell cycle; comparing the level of expression of the one or more genes in the first plant cell to the level of expression of the one or more genes in the second plant cell; and correlating a difference in expression level of the one or more genes between the first and second plant cells to the first or second stage of the cell cycle.
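The detect, compare, and correlate steps shared by the three correlation methods above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the function names, the use of a log2 ratio as the comparison, the pseudocount, the reporting threshold, and the gene labels. The text itself does not prescribe any particular statistic.

```python
import math


def log2_ratios(first, second, pseudocount=1.0):
    """Per-gene log2(first / second) for genes measured in both samples.

    first, second: dicts mapping a gene label to an expression level.
    The pseudocount avoids division by zero and is an illustrative choice.
    """
    shared = first.keys() & second.keys()
    return {
        gene: math.log2((first[gene] + pseudocount) / (second[gene] + pseudocount))
        for gene in sorted(shared)
    }


def differential_genes(first, second, min_abs_log2=1.0):
    """Genes whose absolute log2 ratio meets the (assumed) threshold."""
    ratios = log2_ratios(first, second)
    return {gene: r for gene, r in ratios.items() if abs(r) >= min_abs_log2}


if __name__ == "__main__":
    # Hypothetical expression levels for two samples.
    sample_1 = {"geneA": 7.0, "geneB": 3.0}
    sample_2 = {"geneA": 1.0, "geneB": 3.0}
    print(differential_genes(sample_1, sample_2))
```

Relating the reported differences to a phenotype or to a cell cycle stage would still require replicated measurements and an appropriate statistical test, which this sketch leaves out.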

An aspect of the invention is a combination for detecting expression of one or more genes, comprising two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237.

Another aspect of the invention is a combination for detecting expression of one or more genes, comprising two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237.

The invention further provides a microarray comprising a combination for detecting expression of one or more genes, comprising two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 or wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237, wherein each of said two or more oligonucleotides occupies a unique location on said solid support.

In another aspect, the invention provides a method for detecting one or more genes in a sample, comprising contacting the sample with two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a gene comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 under standard hybridization conditions; and detecting the one or more genes of interest which are hybridized to the one or more oligonucleotides.

The invention also provides a method for detecting one or more nucleic acid sequences encoded by one or more genes in a sample, comprising contacting the sample with two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence encoded by a gene comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-237 under standard hybridization conditions; and detecting the one or more nucleic acid sequences which are hybridized to the one or more oligonucleotides.

The invention further provides a kit for detecting gene expression comprising the microarray of the invention together with one or more buffers or reagents for a nucleotide hybridization reaction.

Other features, objects, and advantages of the present invention are apparent from the detailed description that follows. It should be understood, however, that the detailed description, while indicating preferred embodiments of the invention, is given by way of illustration only, not limitation. Various changes and modifications within the spirit and scope of the invention will be apparent to those skilled in the art from the detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: Exemplary microarray sampling parameters.

FIG. 2: Plasmid map for pWVK202.

FIG. 3: Plasmid map for pGrowth14.

FIG. 4: Plasmid map for pGrowth15.

FIG. 5: Plasmid map for pGrowth16.

FIG. 6: Plasmid map for pGrowth18.

FIG. 7: Plasmid map for pGrowth19.

FIG. 8: Plasmid map for pGrowth20.

LIST OF TABLES

Table 1: shows genes having greater than doubled signal with any one sample as compared to the mean signal of the other three samples.

Table 2: identifies plasmid(s), genes, and Genesis ID numbers for constructs described in Example 17.

Table 3: Rooting medium for Populus deltoides.

Table 4: pGrowth information.

Table 5: shows genes having greater than doubled signal with any one sample as compared to the mean signal of the other three samples.

Table 6: Differentially expressed cDNAs.

Table 7: Consensus ID information.

Table 8: pGrowth information.

Table 9: Eucalyptus grandis cell cycle genes and proteins.

Table 10: Pinus radiata cell cycle genes and proteins.

Table 11: Annotated peptide sequences of the present invention.

Table 12: Eucalyptus in silico data.

Table 13: Pine in silico data.

Table 14: Oligo table.

Table 15: Peptide table.

Table 16: BLAST sequence alignment table.

DETAILED DESCRIPTION

The inventors have discovered novel isolated cell cycle genes and polynucleotides useful for identifying the multigenic factors that contribute to a phenotype and for manipulating gene expression to affect a plant phenotype. These genes, which are derived from plants of commercially important forestry genera, pine and eucalyptus, are involved in the plant cell cycle and are, at least in part, responsible for expression of phenotypic characteristics important in commercial wood, such as stiffness, strength, density, fiber dimensions, coarseness, cellulose and lignin content, and extractives content. Generally speaking, the genes and polynucleotides encode a protein which can be a cyclin, cyclin dependent kinase, cyclin dependent kinase inhibitor, histone acetyltransferase, histone deacetylase, peptidyl-prolyl cis-trans isomerase, retinoblastoma-related protein, WEE1-like protein, or WD40 repeat protein, or a catalytic domain thereof, or a polypeptide having the same function, and the invention further includes such proteins and polypeptides.

The methods of the present invention for selecting cell cycle gene sequences to target for manipulation will permit better design and control of transgenic plants with more highly engineered phenotypes. The ability to control plant architecture and agronomically important traits in commercially important forestry species will be improved by the information obtained from the methods, such as which genes affect which phenotypes, which genes affect entry into which stage of the cell cycle, which genes are active in which stage of plant development, and which genes are expressed in which tissue at a given point in the cell cycle or plant development.

Unless indicated otherwise, all technical and scientific terms are used herein in a manner that conforms to common technical usage. Generally, the nomenclature of this description and the described laboratory procedures, including cell culture, molecular genetics, and nucleic acid chemistry and hybridization, respectively, are well known and commonly employed in the art. Standard techniques are used for recombinant nucleic acid methods, oligonucleotide synthesis, cell culture, tissue culture, transformation, transfection, transduction, analytical chemistry, organic synthetic chemistry, chemical syntheses, chemical analysis, and pharmaceutical formulation and delivery. Generally, enzymatic reactions and purification and/or isolation steps are performed according to the manufacturers' specifications. Absent an indication to the contrary, the techniques and procedures in question are performed according to conventional methodology disclosed, for example, in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2d ed. (Cold Spring Harbor Laboratory Press, 1989), and CURRENT PROTOCOLS IN MOLECULAR BIOLOGY (John Wiley & Sons, 1989). Specific scientific methods relevant to the present invention are discussed in more detail below. However, this discussion is provided as an example only, and does not limit the manner in which the methods of the invention can be carried out.

A. Plant Cell Cycle Genes and Proteins

1. Cell Cycle Genes, Polynucleotide and Polypeptide Sequences

One aspect of the present invention relates to novel plant cell cycle genes and polypeptides encoded by such genes. As used herein, the term “plant cell cycle genes” refers to genes encoding proteins that function during the plant cell cycle, and the term “plant cell cycle proteins” refers to proteins that function during the plant cell cycle. There are several known families of plant cell cycle proteins, including cyclin, cyclin dependent kinase, cyclin dependent kinase inhibitor, histone acetyltransferase, histone deacetylase, peptidyl-prolyl cis-trans isomerase, retinoblastoma-related protein, WEE1-like protein, and WD40 repeat protein. Although there is significant sequence homology within each gene and protein family, each member of each family can display different biochemical properties and altering the expression of at least one of these genes can result in a different plant phenotype.

The present invention provides novel plant cell cycle genes and polynucleotides and novel cell cycle proteins and polypeptides. In accordance with one embodiment of the invention, the novel plant cell cycle genes are the same as those expressed in a wild-type plant of a species of Pinus or Eucalyptus. Exemplary novel plant cell cycle gene sequences of the invention are set forth in Tables 9 and 10, which depict Eucalyptus grandis sequences and Pinus radiata sequences, respectively. Corresponding gene products, i.e., oligonucleotides and polypeptides, are also listed in Tables 14, 15, and 16. The Sequence Listing in APPENDIX 1 provides the sequences of these aspects of the invention.

The sequences of the invention have cell cycle activity and encode proteins that are active in the cell cycle, such as proteins of the cell cycle families discussed above. As discussed in more detail below, manipulation of the expression of the cell cycle genes and polynucleotides, or manipulation of the activity of the encoded proteins and polypeptides, can result in a transgenic plant with a desired phenotype that differs from the phenotype of a wild-type plant of the same species.

Throughout this description, reference is made to cell cycle gene products. As used herein, a “cell cycle gene product” is a product encoded by a cell cycle gene, and includes both nucleotide products, such as RNA, and amino acid products, such as proteins and polypeptides. Examples of specific cell cycle genes of the invention include SEQ ID NOs: 1-237. Examples of specific cell cycle gene products of the invention include products encoded by any one of SEQ ID NOs: 1-237. Reference also is made herein to cell cycle proteins and cell cycle polypeptides. Examples of specific cell cycle proteins and polypeptides of the invention include polypeptides encoded by any of SEQ ID NOs: 1-237 or polypeptides comprising the amino acid sequence of any of SEQ ID NOs: 261-497. One aspect of the invention is directed to a subset of these cell cycle genes and cell cycle gene products, namely SEQ ID NOs: 1-12, 14-58, 60-62, 64-70, 72-75, 77-83, 85-86, 88-91, 93-119, 121-130, 132-148, 150-156, 158-191, 193-207, 209-218, 220-221, 223-231, 233-237, their respective conservative variants (as that term is defined below), and the nucleotide and amino acid products encoded thereby. Another aspect of the invention is directed to a subset of the cell cycle genes and cell cycle gene products, namely SEQ ID NOs: 1-12, 14, 16-26, 30-37, 40-41, 43-76, 78-103, 106, 108-113, 116-121, 124-125, 128-147, 150-152, 154-155, 161-162, 164-172, 174, 177-183, 185-191, 193-197, 200-204, 208-213, and 215-234, their respective conservative variants, and the nucleotide and amino acid products encoded thereby.
A further aspect of the invention is directed to a subset of the cell cycle genes and cell cycle gene products, namely SEQ ID NOs: 1-12, 14, 16-26, 30-37, 40-41, 43-58, 60-62, 64-70, 72-75, 78-83, 85-86, 88-91, 93-103, 106, 108-113, 116-119, 121, 124-125, 128-130, 132-147, 150-152, 154-155, 161-162, 164-172, 174, 177-183, 185-191, 193-197, 200-204, 209-213, 215-218, 220-221, 223-231, and 233-234, their respective conservative variants, and the nucleotide and amino acid products encoded thereby.
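
By way of illustration only, the range lists recited above can be expanded into individual SEQ ID numbers so that two listings can be compared. The following Python sketch is one possible way to do so; the function name expand_ranges is hypothetical, and the two strings contain only the opening ranges of two of the recited lists, shortened for brevity.

```python
import re

def expand_ranges(spec: str) -> set:
    """Expand a list such as '1-12, 14, 16-26' into a set of integers."""
    ids = set()
    # Each match is (start, end); end is '' when the entry is a single number.
    for start, end in re.findall(r"(\d+)(?:\s*-\s*(\d+))?", spec):
        lo = int(start)
        hi = int(end) if end else lo
        ids.update(range(lo, hi + 1))
    return ids

# Truncated excerpts of two recited lists (illustrative, not the full lists).
first_excerpt = expand_ranges("1-12, 14-58, 60-62")
second_excerpt = expand_ranges("1-12, 14, 16-26")

print(13 in first_excerpt)              # False: 13 is skipped in the list
print(second_excerpt <= first_excerpt)  # True: every listed ID also appears in the first excerpt
```

A set representation is used here because membership and subset tests then reduce to built-in operations.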

The present invention also includes sequences that are complements, reverse sequences, or reverse complements to the nucleotide sequences disclosed herein.
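
The three related sequences named above can be derived mechanically from any nucleotide sequence. The following Python sketch is provided as an illustration only; the function names and the six-base example are hypothetical and are not taken from the Sequence Listing.

```python
# Illustrative only: the example sequence is made up, not a SEQ ID NO.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def complement(seq: str) -> str:
    """Base-by-base complement, written in the same direction as the input."""
    return seq.translate(_COMPLEMENT)

def reverse(seq: str) -> str:
    """The same bases read in the opposite direction."""
    return seq[::-1]

def reverse_complement(seq: str) -> str:
    """Complement of the sequence, read in the opposite direction."""
    return complement(seq)[::-1]

example = "ATGGCC"
print(complement(example))          # TACCGG
print(reverse(example))             # CCGGTA
print(reverse_complement(example))  # GGCCAT
```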

The present invention also includes conservative variants of the sequences disclosed herein. The term “variant,” as used herein, refers to a nucleotide or amino acid sequence that differs in one or more nucleotide bases or amino acid residues from the reference sequence of which it is a variant.

Thus, in one aspect, the invention includes conservative variant polynucleotides. As used herein, the term “conservative variant polynucleotide” refers to a polynucleotide that hybridizes under stringent conditions to an oligonucleotide probe that, under comparable conditions, binds to the reference gene the conservative variant is a variant of. Thus, for example, a conservative variant of SEQ ID NO: 1 hybridizes under stringent conditions to an oligonucleotide probe that, under comparable conditions, binds to SEQ ID NO: 1. One aspect of the invention provides conservative variant polynucleotides that exhibit at least about 75% sequence identity to their respective reference sequences.

“Sequence identity” has an art-recognized meaning and can be calculated using published techniques. See COMPUTATIONAL MOLECULAR BIOLOGY, Lesk, ed. (Oxford University Press, 1988), BIOCOMPUTING: INFORMATICS AND GENOME PROJECTS, Smith, ed. (Academic Press, 1993), COMPUTER ANALYSIS OF SEQUENCE DATA, PART I, Griffin & Griffin, eds., (Humana Press, 1994), SEQUENCE ANALYSIS IN MOLECULAR BIOLOGY, von Heijne, ed., Academic Press (1987), SEQUENCE ANALYSIS PRIMER, Gribskov & Devereux, eds. (Macmillan Stockton Press, 1991), and Carillo & Lipman, SIAM J. Applied Math. 48: 1073 (1988). Methods commonly employed to determine identity or similarity between two sequences include but are not limited to those disclosed in GUIDE TO HUGE COMPUTERS, Bishop, ed., (Academic Press, 1994) and Carillo & Lipman, supra. Methods to determine identity and similarity are codified in computer programs. Preferred computer program methods to determine identity and similarity between two sequences include but are not limited to the GCG program package (Devereux et al., Nucleic Acids Research 12: 387 (1984)), BLASTP, BLASTN, FASTA (Altschul et al., J. Mol. Biol. 215: 403 (1990)), and FASTDB (Brutlag et al., Comp. App. Biosci. 6: 237 (1990)).

The invention includes conservative variant polynucleotides having a sequence identity that is greater than or equal to 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 89%, 88%, 87%, 86%, 85%, 84%, 83%, 82%, 81%, 80%, 79%, 78%, 77%, 76%, 75%, 74%, 73%, 72%, 71%, 70%, 69%, 68%, 67%, 66%, 65%, 64%, 63%, 62%, 61%, or 60% to any one of SEQ ID NOs: 1 to 237. In such variants, differences between the variant and the reference sequence can occur at the 5′ or 3′ terminal positions of the reference nucleotide sequence or anywhere between those terminal positions, interspersed either individually among nucleotides in the reference sequence or in one or more contiguous groups within the reference sequence.
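
As a simplified illustration of how a percent identity value can be compared against one of the thresholds recited above, the following Python sketch counts matching positions in a pairwise alignment that has already been computed. It is not the method used by any of the programs cited herein, each of which applies its own alignment and scoring rules. The function names, the example sequences, and the choice of the ungapped reference length as the denominator are assumptions made for this sketch.

```python
def percent_identity(aligned_ref: str, aligned_var: str) -> float:
    """Percent of reference positions matched by the variant.

    Both strings come from one pairwise alignment, are equal in length,
    and use '-' for gaps. Assumption: the denominator is the ungapped
    reference length; other conventions exist.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for r in aligned_ref if r != "-")
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    matches = sum(
        1
        for r, v in zip(aligned_ref, aligned_var)
        if r != "-" and r.upper() == v.upper()
    )
    return 100.0 * matches / ref_len

def meets_threshold(aligned_ref: str, aligned_var: str, threshold: float = 75.0) -> bool:
    """True when the variant reaches the chosen identity threshold."""
    return percent_identity(aligned_ref, aligned_var) >= threshold

# Hypothetical 10-base reference and a variant with one substitution.
ref = "ATGGCCAAGT"
var = "ATGGCTAAGT"
print(percent_identity(ref, var))       # 90.0
print(meets_threshold(ref, var, 95.0))  # False
```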

Additional conservative variant polynucleotides contemplated by and encompassed within the present invention include polynucleotides comprising sequences that differ from the polynucleotide sequences of SEQ ID NOs: 1-237, or complements, reverse complements or reverse sequences thereof, as a result of deletions and/or insertions totaling less than 10% of the total sequence length.
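
The 10% limit on deletions and insertions can be checked from a pairwise alignment by counting gapped positions. The following Python sketch is one way to do so and is offered as an illustration only; the function name and sequences are hypothetical, and the sketch assumes that "total sequence length" means the length of the ungapped reference sequence.

```python
def indel_fraction(aligned_ref: str, aligned_var: str) -> float:
    """Inserted plus deleted positions, as a fraction of the reference length.

    A '-' in the variant row marks a deletion; a '-' in the reference row
    marks an insertion. Assumption: the denominator is the ungapped
    reference length.
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for r in aligned_ref if r != "-")
    if ref_len == 0:
        raise ValueError("reference sequence is empty")
    pairs = list(zip(aligned_ref, aligned_var))
    deletions = sum(1 for r, v in pairs if v == "-" and r != "-")
    insertions = sum(1 for r, v in pairs if r == "-" and v != "-")
    return (insertions + deletions) / ref_len

# Hypothetical 20-base reference; the variant lacks one base.
ref = "ATGGCCAAGTTTGACCTGAA"
var = "ATGGCC-AGTTTGACCTGAA"
print(indel_fraction(ref, var))         # 0.05
print(indel_fraction(ref, var) < 0.10)  # True
```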

The invention also includes conservative variant polynucleotides that, in addition to sharing a high degree of similarity in their primary structure (sequence) to SEQ ID NOs: 1 to 237, have at least one of the following features: (i) they contain an open reading frame or partial open reading frame encoding a polypeptide having substantially the same functional properties in the cell cycle as the polypeptide encoded by the reference polynucleotide, or (ii) they have nucleotide domains or encoded protein domains in common. The invention includes conservative variants of SEQ ID NOs: 1-237 that encode proteins having the enzyme or biological activity or binding properties of the protein encoded by the reference polynucleotide. Such conservative variants are functional variants, in that they have the enzymatic or binding activity of the protein encoded by the reference polynucleotide.

In accordance with the invention, polynucleotide variants can include a “shuffled gene” such as those described in, e.g., U.S. Pat. Nos. 6,500,639, 6,500,617, 6,436,675, 6,379,964, 6,352,859, 6,335,198, 6,326,204, and 6,287,862. A variant of a nucleotide sequence of the present invention also can be a polynucleotide modified as disclosed in U.S. Pat. No. 6,132,970, which is incorporated herein by reference.

In accordance with one embodiment, the invention provides a polynucleotide that encodes a cell cycle protein from one of the following families: cyclin, cyclin dependent kinase, cyclin dependent kinase inhibitor, histone acetyltransferase, histone deacetylase, peptidyl-prolyl cis-trans isomerase, retinoblastoma-related protein, WEE1-like protein, or WD40 repeat protein. SEQ ID NOs: 1-237 provide examples of such polynucleotides.

In accordance with another embodiment, a polynucleotide of the invention encodes the catalytic or protein binding domain of a polypeptide encoded by any of SEQ ID NOs: 1-237 or of a polypeptide comprising any of SEQ ID NOs: 261-497. The catalytic and protein binding domains of the cell cycle proteins of the invention are known in the art. The conserved sequences of these proteins are shown in Entries 1-195 as underlined, bold, and/or italicized text.

The invention also encompasses as conservative variants polynucleotides that differ from the sequences discussed above but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide which is the same as that encoded by a polynucleotide of the present invention. The invention also includes as conservative variants polynucleotides comprising sequences that differ from the polynucleotide sequences discussed above as a result of substitutions that do not affect the amino acid sequence of the encoded polypeptide sequence, or that result in conservative substitutions in the encoded polypeptide sequence.
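
The effect of genetic code degeneracy can be illustrated by translating two different coding sequences and comparing the results. The following Python sketch uses the standard genetic code; the function names and the nine-base sequences are hypothetical examples and are not drawn from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def encodes_same_polypeptide(seq_a: str, seq_b: str) -> bool:
    """True when two coding sequences give the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)

# Two made-up sequences that differ at the third position of two codons.
reference = "ATGGCTAAA"
variant = "ATGGCCAAG"
print(translate(reference))                          # MAK
print(encodes_same_polypeptide(reference, variant))  # True
```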

The present invention also includes an isolated polypeptide encoded by a polynucleotide comprising any of SEQ ID NOs: 1-237 or any of the conservative variants thereof discussed above. The invention also includes polypeptides comprising SEQ ID NOs: 261-497 and 495-497 and conservative variants of these polypeptides. Another aspect of the invention includes polypeptides comprising SEQ ID NOs: 261-272, 274-318, 320-322, 324-330, 332-335, 337-343, 345-346, 348-351, 353-379, 381-390, 392-408, 410-416, 418-451, 453-467, 469-478, 480-481, 483-491, and 493-494 and conservative variants thereof. A further aspect of the invention includes polypeptides comprising SEQ ID NOs: 261-272, 274, 276-286, 289, 290-297, 300-301, 303-345, 347-363, 366, 368-373, 376-381, 384-385, 388-407, 410-412, 414-415, 420-422, 424-432, 434, 437-443, 445-451, 453-457, 460-464, 468-473, and 475-494 and conservative variants thereof. Another aspect of the invention includes polypeptides comprising SEQ ID NOs: 261-272, 274, 276-286, 290-297, 300-301, 303-318, 320-322, 324-330, 332-335, 337-343, 345, 348-351, 353-363, 366, 368-373, 376-381, 384-385, 388-390, 392-407, 410-412, 414-415, 421-422, 424-432, 434, 437-443, 445-451, 453-457, 460-464, 469-473, 475-478, 480-481, 483-491, and 493-494 and conservative variants thereof.

In accordance with the invention, a variant polypeptide or protein refers to an amino acid sequence that is altered by the addition, deletion or substitution of one or more amino acids.

The invention includes conservative variant polypeptides. As used herein, the term “conservative variant polypeptide” refers to a polypeptide that has similar structural, chemical or biological properties to the protein it is a conservative variant of. Guidance in determining which amino acid residues can be substituted, inserted, or deleted can be found using computer programs well known in the art such as Vector NTI Suite (InforMax, MD) software. One embodiment of the invention provides conservative variant polypeptides that exhibit at least about 75% sequence identity to their respective reference sequences.

A conservative variant protein includes an “isoform” or “analog” of the polypeptide. Polypeptide isoforms and analogs refer to proteins having the same physical and physiological properties and the same biological function, but whose amino acid sequences differ by one or more amino acids or whose sequence includes a non-natural amino acid.

Polypeptides comprising sequences that differ from the polypeptide sequences of SEQ ID NOs: 261-497 as a result of amino acid substitutions, insertions, and/or deletions totaling less than 10% of the total sequence length are contemplated by and encompassed within the present invention.

One aspect of the invention provides conservative variant polypeptides that have the same function in the cell cycle as the proteins of which they are variants, as determined by one or more appropriate assays, such as those described below. The invention includes variant polypeptides that function as cell cycle proteins, such as those having the biological activity of cyclin, cyclin dependent kinase, cyclin dependent kinase inhibitor, histone acetyltransferase, histone deacetylase, peptidyl-prolyl cis-trans isomerase, retinoblastoma-related protein, WEE1-like protein, and WD40 repeat protein, and are thus capable of modulating the cell cycle in a plant. As discussed above, the invention includes variant polynucleotides that encode polypeptides that function as cell cycle proteins.

The activities and physical properties of cell cycle proteins can be examined using any method known in the art. The following examples of assay methods are not exhaustive and are included to provide some guidance in examining the activity and distinguishing protein characteristics of cell cycle protein variants.

CDK activity can be assessed using roscovitine as described in Yamaguchi et al., Proc. Natl. Acad. Sci. U.S.A. 100:8019 (2003). CDK histone kinase activity can be assayed using autoradiography to detect histone H1 phosphorylation by CDK as described in Joubés et al., Plant Physiol. 121:857 (1999).

CKI activity can be assayed using a variation of the method described in Zhou et al., Planta. 6:604 (2003). The modified method can employ co-transformation or subsequent transformations to identify the interaction of CKI and cyclins in vivo. For example, in the first transformation pine tissue can be transformed using the method described in U.S. Patent Application Publication No. 2002/0100083 using geneticin selection to obtain transgenic plants possessing cycD3 and cdc2a homologs. The second transformation can be performed using alpha-methyltryptophan as a selectable marker to obtain transformants having an ICK1 homolog as described in U.S. Provisional Application No. 60/476,189. Tissue capable of growing both on geneticin and on alpha-methyltryptophan contains the ICK1 homolog and the cycD3 and cdc2a homologs. The CKI activity is determined by comparing the phenotype of transformants having the cycD3 and cdc2a homologs to that of transformants having the ICK1 homolog and the cycD3 and cdc2a homologs.

Histone deacetylase activity can be assessed by complementation of the Arabidopsis mutants described in Tian et al., Genetics 165:399 (2003). Histone acetyltransferase activity can be assayed using anacardic acid as described in Balasubramanyam et al., J. Biol. Chem. 278:19134 (2003). Histone acetyltransferase also can be assayed using trichostatin A-treated plant lines as is described in Bhat et al., Plant J. 33:455 (2003). The plant lines described in Bhat et al., supra, also can be used to assay retinoblastoma-related proteins using the co-precipitation method described in Rossi et al., Plant Mol. Biol. 51:401 (2003).

Peptidyl-prolyl isomerase can be assayed as described in Edvardsson et al., FEBS Lett. 542:137 (2003). WD40 proteins can be evaluated based on the possession of the WD40 motif as well as their ability to interact with cdc2. WEE-1 can be assayed using any kinase activity assay known in the art.

2. Methods of Using Cell Cycle Genes, Polynucleotide and Polypeptide Sequences

The present invention provides methods of using plant cell cycle genes and conservative variants thereof. The invention includes methods and constructs for altering expression of plant cell cycle genes and/or gene products for purposes including, but not limited to, (i) investigating function during the cell cycle and ultimate effect on plant phenotype and (ii) effecting a change in plant phenotype. For example, the invention includes methods and tools for modifying wood quality, fiber development, cell wall polysaccharide content, fruit ripening, and plant growth and yield by altering expression of one or more plant cell cycle genes.

The invention comprises methods of altering the expression of any of the cell cycle genes and variants discussed above. Thus, for example, the invention comprises altering expression of a cell cycle gene present in the genome of a wild-type plant of a species of Eucalyptus or Pinus. In one embodiment, the cell cycle gene comprises a nucleotide sequence selected from SEQ ID NOs: 1-237, from the subset thereof comprising SEQ ID NOs: 1-12, 14-58, 60-62, 64-70, 72-75, 77-83, 85-86, 88-91, 93-119, 121-130, 132-148, 150-156, 158-191, 193-207, 209-218, 220-221, 223-231, and 233-237, from the subset thereof comprising SEQ ID NOs: 1-12, 14, 16-26, 30-37, 40-41, 43-76, 78-103, 106, 108-113, 116-121, 124-125, 128-147, 150-152, 154-155, 161-162, 164-172, 174, 177-183, 185-191, 193-197, 200-204, 208-213, and 215-234, from the subset thereof comprising SEQ ID NOs: 1-12, 14, 16-26, 30-37, 40-41, 43-58, 60-62, 64-70, 72-75, 78-83, 85-86, 88-91, 93-103, 106, 108-113, 116-119, 121, 124-125, 128-130, 132-147, 150-152, 154-155, 161-162, 164-172, 174, 177-183, 185-191, 193-197, 200-204, 209-213, 215-218, 220-221, 223-231, and 233-234, or the conservative variants thereof, as discussed above.

Techniques which can be employed in accordance with the present invention to alter gene expression include, but are not limited to: (i) over-expressing a gene product; (ii) disrupting a gene's transcript, such as disrupting a gene's mRNA transcript; (iii) disrupting the function of a polypeptide encoded by a gene; or (iv) disrupting the gene itself. Over-expression of a gene product, the use of antisense RNAs, ribozymes, and the use of double-stranded RNA interference (dsRNAi) are valuable techniques for discovering the functional effects of a gene and for generating plants with a phenotype that is different from a wild-type plant of the same species.

Over-expression of a target gene often is accomplished by cloning the gene or cDNA into an expression vector and introducing the vector into recipient cells. Alternatively, over-expression can be accomplished by introducing exogenous promoters into cells to drive expression of genes residing in the genome. The effect of over-expression of a given gene on cell function, biochemical and/or physiological properties can then be evaluated by comparing plants transformed to over-express the gene to plants that have not been transformed to over-express the gene.

Antisense RNA, ribozyme, and dsRNAi technologies typically target RNA transcripts of genes, usually mRNA. Antisense RNA technology involves expressing in, or introducing into, a cell an RNA molecule (or RNA derivative) that is complementary to, or antisense to, sequences found in a particular mRNA in a cell. By associating with the mRNA, the antisense RNA can inhibit translation of the encoded gene product. The use of antisense technology to reduce or inhibit the expression of specific plant genes has been described, for example, in European Patent Publication No. 271988; Smith et al., Nature, 334:724-726 (1988); and Smith et al., Plant Mol. Biol., 14:369-379 (1990).

A ribozyme is an RNA that has both a catalytic domain and a sequence that is complementary to a particular mRNA. The ribozyme functions by associating with the mRNA (through the complementary domain of the ribozyme) and then cleaving (degrading) the message using the catalytic domain.

RNA interference (RNAi) involves a post-transcriptional gene silencing (PTGS) regulatory process, in which the steady-state level of a specific mRNA is reduced by sequence-specific degradation of the transcribed, usually fully processed mRNA without an alteration in the rate of de novo transcription of the target gene itself. The RNAi technique is discussed, for example, in Elbashir et al., Methods Enzymol. 26: 199 (2002); McManus & Sharp, Nature Rev. Genetics 3: 737 (2002); PCT application WO 01/75164; Martinez et al., Cell 110: 563 (2002); Elbashir et al., supra; Lagos-Quintana et al., Curr. Biol. 12: 735 (2002); Tuschl et al., Nat. Biotechnol. 20:446 (2002); Tuschl, Chembiochem. 2: 239 (2001); Harborth et al., J. Cell Sci. 114: 4557 (2001); et al., EMBO J. 20:6877 (2001); Lagos-Quintana et al., Science. 294: 8538 (2001); Hutvagner et al., loc cit, 834; Elbashir et al., Nature. 411: 494 (2001).

The present invention provides a DNA construct comprising at least one polynucleotide of SEQ ID NOs: 1-235 or conservative variants thereof, such as the conservative variants discussed above. Any method known in the art can be used to generate the DNA constructs of the present invention. See, e.g. Sambrook et al., supra.

The invention includes DNA constructs that optionally comprise a promoter. Any suitable promoter known in the art can be used. A promoter is a nucleic acid, preferably DNA, that binds RNA polymerase and/or other transcription regulatory elements. As with any promoter, the promoters of the invention facilitate or control the transcription of DNA or RNA to generate an mRNA molecule from a nucleic acid molecule that is operably linked to the promoter. The RNA can encode a protein or polypeptide or can encode an antisense RNA molecule or a molecule useful in RNAi. Promoters useful in the invention include constitutive promoters, inducible promoters, temporally regulated promoters and tissue-preferred promoters.

Examples of useful constitutive plant promoters include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers constitutive, high-level expression in most plant tissues (Odell et al., Nature 313:810 (1985)); the nopaline synthase promoter (An et al., Plant Physiol. 88:547 (1988)); and the octopine synthase promoter (Fromm et al., Plant Cell 1:977 (1989)). It should be noted that, although the CaMV 35S promoter is commonly referred to as a constitutive promoter, some tissue preference can be seen. The use of CaMV 35S is envisioned by the present invention, regardless of any tissue preference which may be exhibited during use in the present invention.

Inducible promoters regulate gene expression in response to environmental, hormonal, or chemical signals. Examples of hormone inducible promoters include auxin-inducible promoters (Baumann et al., Plant Cell 11:323-334 (1999)), cytokinin-inducible promoters (Guevara-Garcia, Plant Mol. Biol. 38:743-753 (1998)), and gibberellin-responsive promoters (Shi et al., Plant Mol. Biol. 38:1053-1060 (1998)). Additionally, promoters responsive to heat, light, wounding, pathogen resistance, and chemicals such as methyl jasmonate or salicylic acid, can be used in the DNA constructs and methods of the present invention.

Tissue-preferred promoters allow for preferred expression of polynucleotides of the invention in certain plant tissue. Tissue-preferred promoters are also useful for directing the expression of antisense RNA or siRNA in certain plant tissues, which can be useful for inhibiting or completely blocking the expression of targeted genes as discussed above. As used herein, vascular plant tissue refers to xylem, phloem or vascular cambium tissue. Other preferred tissue includes apical meristem, root, seed, and flower. In one aspect, the tissue-preferred promoters of the invention are either “xylem-preferred,” “cambium-preferred” or “phloem-preferred,” and preferentially direct expression of an operably linked nucleic acid sequence in the xylem, cambium or phloem, respectively. In another aspect, the DNA constructs of the invention comprise promoters that are tissue-specific for xylem, cambium or phloem, wherein the promoters are only active in the xylem, cambium or phloem.

A vascular-preferred promoter is preferentially active in any of the xylem, phloem or cambium tissues, or in at least two of the three tissue types. A vascular-specific promoter is specifically active in any of the xylem, phloem or cambium, or in at least two of the three. In other words, the promoters are only active in the xylem, cambium or phloem tissue of plants. Note, however, that because of solute transport in plants, a product that is specifically or preferentially expressed in a tissue may be found elsewhere in the plant after expression has occurred.

In another embodiment, the promoter is under temporal regulation, wherein the ability of the promoter to initiate expression is linked to factors such as the stage of the cell cycle or the stage of plant development. For example, the promoter of a cyclin D2 gene may be expressed only during the G1 and early S-phase, and the promoters of particular cyclin genes may be expressed only within the primary vascular poles of the developing seedling.

Additionally, the promoters of particular cell cycle genes may be expressed only within the cambium in developing secondary vasculature. Within the cambium, particular cell cycle gene promoters may be expressed exclusively in the stem or in the root. Moreover, the cell cycle promoters may be expressed only in the spring (for early wood formation) or only in the summer.

A promoter may be operably linked to the polynucleotide. As used in this context, operably linked refers to linking a polynucleotide encoding a structural gene to a promoter such that the promoter controls transcription of the structural gene. If the desired polynucleotide comprises a sequence encoding a protein product, the coding region can be operably linked to regulatory elements, such as to a promoter and a terminator, that bring about expression of an associated messenger RNA transcript and/or a protein product encoded by the desired polynucleotide. In this instance, the polynucleotide is operably linked in the 5′- to 3′-orientation to a promoter and, optionally, a terminator sequence.

Alternatively, the invention provides DNA constructs comprising a polynucleotide in an “antisense” orientation, the transcription of which produces nucleic acids that can form secondary structures that affect expression of an endogenous cell cycle gene in the plant cell. In another variation, the DNA construct may comprise a polynucleotide that yields a double-stranded RNA product upon transcription that initiates RNA interference of a cell cycle gene with which the polynucleotide is associated. A polynucleotide of the present invention can be positioned within a t-DNA, such that the left and right t-DNA border sequences flank or are on either side of the polynucleotide.

It should be understood that the invention includes DNA constructs comprising one or more of any of the polynucleotides discussed above. Thus, for example, a construct may comprise a T-DNA comprising one, two, three, four, five, six, seven, eight, nine, ten, or more polynucleotides.

The invention also includes DNA constructs comprising a promoter that includes one or more regulatory elements. Alternatively, the invention includes DNA constructs comprising a regulatory element that is separate from a promoter. Regulatory elements confer a number of important characteristics upon a promoter region. Some elements bind transcription factors that enhance the rate of transcription of the operably linked nucleic acid. Other elements bind repressors that inhibit transcription activity. The effect of transcription factors on promoter activity can determine whether the promoter activity is high or low, i.e. whether the promoter is “strong” or “weak.”

A DNA construct of the invention can include a nucleotide sequence that serves as a selectable marker useful in identifying and selecting transformed plant cells or plants. Examples of such markers include, but are not limited to, a neomycin phosphotransferase (nptII) gene (Potrykus et al., Mol. Gen. Genet. 199:183-188 (1985)), which confers kanamycin resistance. Cells expressing the nptII gene can be selected using an appropriate antibiotic such as kanamycin or G418. Other commonly used selectable markers include a mutant EPSP synthase gene (Hinchee et al., Bio/Technology 6:915-922 (1988)), which confers glyphosate resistance; and a mutant acetolactate synthase gene (ALS), which confers imidazolinone or sulphonylurea resistance (European Patent Application 154,204, 1985).

The present invention also includes vectors comprising the DNA constructs discussed above. The vectors can include an origin of replication (replicons) for a particular host cell. Various prokaryotic replicons are known to those skilled in the art, and function to direct autonomous replication and maintenance of a recombinant molecule in a prokaryotic host cell.

In one embodiment, the present invention utilizes a pWVR8 vector as described in U.S. Application No. 60/476,222, filed Jun. 6, 2003, or pART27 as described in Gleave, Plant Mol. Biol. 20:1203-27 (1992).

The invention also provides host cells which are transformed with the DNA constructs of the invention. As used herein, a host cell refers to the cell in which a polynucleotide of the invention is expressed. Accordingly, a host cell can be an individual cell, a cell culture or cells that are part of an organism. The host cell can also be a portion of an embryo, endosperm, sperm or egg cell, or a fertilized egg. In one embodiment, the host cell is a plant cell.

The present invention further provides transgenic plants comprising the DNA constructs of the invention. The invention includes transgenic plants that are angiosperms or gymnosperms. The DNA constructs of the present invention can be used to transform a variety of plants, including monocotyledonous plants (e.g., grasses, corn, grains, oat, wheat and barley), dicotyledonous plants (e.g., Arabidopsis, tobacco, legumes, alfalfa, oaks, eucalyptus, maple), and gymnosperms, e.g., Scots pine (Aronen, Finnish Forest Res. Papers, Vol. 595, 1996), white spruce (Ellis et al., Biotechnology 11:84-89, 1993), and larch (Huang et al., In Vitro Cell 27:201-207, 1991).

The plants also include turfgrass, wheat, maize, rice, sugar beet, potato, tomato, lettuce, carrot, strawberry, cassava, sweet potato, geranium, soybean, and various types of woody plants. Woody plants include trees such as palm, oak, pine, maple, fir, apple, fig, plum and acacia. Woody plants also include rose and grape vines.

In one embodiment, the DNA constructs of the invention are used to transform woody plants, i.e., trees or shrubs whose stems live for a number of years and increase in diameter each year by the addition of woody tissue. The invention includes methods of transforming plants including eucalyptus and pine species of significance in the commercial forestry industry such as plants selected from the group consisting of Eucalyptus grandis and its hybrids, and Pinus taeda, as well as the transformed plants and wood and wood pulp derived therefrom. Other examples of suitable plants include those selected from the group consisting of Pinus banksiana, Pinus brutia, Pinus caribaea, Pinus clausa, Pinus contorta, Pinus coulteri, Pinus echinata, Pinus eldarica, Pinus elliottii, Pinus jeffreyi, Pinus lambertiana, Pinus massoniana, Pinus monticola, Pinus nigra, Pinus palustris, Pinus pinaster, Pinus ponderosa, Pinus radiata, Pinus resinosa, Pinus rigida, Pinus serotina, Pinus strobus, Pinus sylvestris, Pinus taeda, Pinus virginiana, Abies amabilis, Abies balsamea, Abies concolor, Abies grandis, Abies lasiocarpa, Abies magnifica, Abies procera, Chamaecyparis lawsoniana, Chamaecyparis nootkatensis, Chamaecyparis thyoides, Juniperus virginiana, Larix decidua, Larix laricina, Larix leptolepis, Larix occidentalis, Larix sibirica, Libocedrus decurrens, Picea abies, Picea engelmannii, Picea glauca, Picea mariana, Picea pungens, Picea rubens, Picea sitchensis, Pseudotsuga menziesii, Sequoia gigantea, Sequoia sempervirens, Taxodium distichum, Tsuga canadensis, Tsuga heterophylla, Tsuga mertensiana, Thuja occidentalis, Thuja plicata, Eucalyptus alba, Eucalyptus bancroftii, Eucalyptus botryoides, Eucalyptus bridgesiana, Eucalyptus calophylla, Eucalyptus camaldulensis, Eucalyptus citriodora, Eucalyptus cladocalyx, Eucalyptus coccifera, Eucalyptus curtisii, Eucalyptus dalrympleana, Eucalyptus deglupta, Eucalyptus delegatensis, Eucalyptus diversicolor, Eucalyptus dunnii, Eucalyptus ficifolia, Eucalyptus globulus, Eucalyptus gomphocephala, Eucalyptus gunnii, Eucalyptus henryi, Eucalyptus laevopinea, Eucalyptus macarthurii, Eucalyptus macrorhyncha, Eucalyptus maculata, Eucalyptus marginata, Eucalyptus megacarpa, Eucalyptus melliodora, Eucalyptus nicholii, Eucalyptus nitens, Eucalyptus nova-anglica, Eucalyptus obliqua, Eucalyptus occidentalis, Eucalyptus obtusiflora, Eucalyptus oreades, Eucalyptus pauciflora, Eucalyptus polybractea, Eucalyptus regnans, Eucalyptus resinifera, Eucalyptus robusta, Eucalyptus rudis, Eucalyptus saligna, Eucalyptus sideroxylon, Eucalyptus stuartiana, Eucalyptus tereticornis, Eucalyptus torelliana, Eucalyptus urnigera, Eucalyptus urophylla, Eucalyptus viminalis, Eucalyptus viridis, Eucalyptus wandoo, and Eucalyptus youmanii.

As used herein, the term “plant” also is intended to include the fruit, seeds, flower, strobilus, etc. of the plant. A transformed plant of the current invention can be a direct transfectant, meaning that the DNA construct was introduced directly into the plant, such as through Agrobacterium, or the plant can be the progeny of a transfected plant. The second or subsequent generation plant can be produced by sexual reproduction, i.e., fertilization. Furthermore, the plant can be a gametophyte (haploid stage) or a sporophyte (diploid stage).

As used herein, the term “plant tissue” encompasses any portion of a plant, including plant cells. Plant cells include suspension cultures, callus, embryos, meristematic regions, leaves, roots, shoots, gametophytes, sporophytes, pollen, seeds and microspores. Plant tissues can be grown in liquid or solid culture, or in soil or suitable media in pots, greenhouses or fields. As used herein, “plant tissue” also refers to a clone of a plant, seed, progeny, or propagule, whether generated sexually or asexually, and descendants of any of these, such as cuttings or seeds.

In accordance with one aspect of the invention, a transgenic plant that has been transformed with a DNA construct of the invention has a phenotype that is different from a plant that has not been transformed with the DNA construct.

As used herein, “phenotype” refers to a distinguishing feature or characteristic of a plant which can be altered according to the present invention by integrating one or more DNA constructs of the invention into the genome of at least one plant cell of a plant. The DNA construct can confer a change in the phenotype of a transformed plant by modifying any one or more of a number of genetic, molecular, biochemical, physiological, morphological, or agronomic characteristics or properties of the transformed plant cell or plant as a whole.

In one embodiment, transformation of a plant with a DNA construct of the present invention can yield a phenotype including, but not limited to any one or more of increased drought tolerance, herbicide resistance, reduced or increased height, reduced or increased branching, enhanced cold and frost tolerance, improved vigor, enhanced color, enhanced health and nutritional characteristics, improved storage, enhanced yield, enhanced salt tolerance, enhanced resistance of the wood to decay, enhanced resistance to fungal diseases, altered attractiveness to insect pests, enhanced heavy metal tolerance, increased disease tolerance, increased insect tolerance, increased water-stress tolerance, enhanced sweetness, improved texture, decreased phosphate content, increased germination, increased micronutrient uptake, improved starch composition, improved flower longevity, production of novel resins, and production of novel proteins or peptides.

In another embodiment, the affected phenotype includes one or more of the following traits: propensity to form reaction wood, a reduced period of juvenility, an increased period of juvenility, self-abscising branches, accelerated reproductive development or delayed reproductive development, as compared to a plant of the same species that has not been transformed with the DNA construct.

In a further embodiment, the phenotype that is different in the transgenic plant includes one or more of the following: lignin quality, lignin structure, wood composition, wood appearance, wood density, wood strength, wood stiffness, cellulose polymerization, fiber dimensions, lumen size, other plant components, plant cell division, plant cell development, number of cells per unit area, cell size, cell shape, cell wall composition, rate of wood formation, aesthetic appearance of wood, formation of stem defects, average microfibril angle, width of the S2 cell wall layer, rate of growth, rate of root formation, ratio of root to branch vegetative development, leaf area index, and leaf shape.

Phenotype can be assessed by any suitable means. The plants can be evaluated based on their general morphology. Transgenic plants can be observed with the naked eye, can be weighed and their height measured. The plant can be examined by isolating individual layers of plant tissue, namely phloem and cambium, which is further sectioned into meristematic cells, early expansion, late expansion, secondary wall formation, and late cell maturation. See, e.g., Hertzberg, supra. The plants also can be assessed using microscopic analysis or chemical analysis.

Microscopic analysis includes examining cell types, stage of development, and stain uptake by tissues and cells. Fiber morphology, such as fiber wall thickness and microfibril angle of wood pulp fibers can be observed using, for example, microscopic transmission ellipsometry. See Ye and Sundström, Tappi J., 80:181 (1997). Wood strength, density, and grain slope in wet wood and standing trees can be determined by measuring the visible and near infrared spectral data in conjunction with multivariate analysis. See, U.S. Patent Application Publication Nos. 2002/0107644 and 2002/0113212. Lumen size can be measured using scanning electron microscopy. Lignin structure and chemical properties can be observed using nuclear magnetic resonance spectroscopy as described in Marita et al., J. Chem. Soc., Perkin Trans. I 2939 (2001).

The biochemical characteristics of lignin, cellulose, carbohydrates and other plant extracts can be evaluated by any standard analytical method known in the art, including spectrophotometry, fluorescence spectroscopy, HPLC, mass spectrometry, and tissue staining methods.

As used herein, “transformation” refers to a process by which a nucleic acid is inserted into the genome of a plant cell. Such insertion encompasses stable introduction into the plant cell and transmission to progeny. Transformation also refers to transient insertion of a nucleic acid, wherein the resulting transformant transiently expresses the nucleic acid. Transformation can occur under natural or artificial conditions using various methods well known in the art. Transformation can be achieved by any known method for the insertion of nucleic acid sequences into a prokaryotic or eukaryotic host cell, including Agrobacterium-mediated transformation protocols, viral infection, whiskers, electroporation, microinjection, polyethylene glycol-treatment, heat shock, lipofection, and particle bombardment. Transformation can also be accomplished using chloroplast transformation as described in e.g. Svab et al., Proc. Natl Acad. Sci. 87:8526-30 (1990).

In accordance with one embodiment of the invention, transformation in Eucalyptus is performed as described in U.S. Patent Application No. 60/476,222 (supra) which is incorporated herein by reference in its entirety. In accordance with another embodiment, transformation of Pinus is accomplished using the methods described in U.S. Patent Application Publication No. 2002/0100083.

Another aspect of the invention provides methods of obtaining wood and/or making wood pulp from a plant transformed with a DNA construct of the invention. Methods of producing a transgenic plant are provided above and are known in the art. A transformed plant can be cultured or grown under any suitable conditions. For example, pine can be cultured and grown as described in U.S. Patent Application Publication No. 2002/0100083. Eucalyptus can be cultured and grown as in, for example, Rydelius, et al., GROWING EUCALYPTUS FOR PULP AND ENERGY, presented at the Mechanization in Short Rotation, Intensive Culture Forestry Conference, Mobile, Ala., 1994. Wood and wood pulp can be obtained from the plant by any means known in the art.

As noted above, the wood or wood pulp obtained in accordance with this invention may demonstrate improved characteristics including, but not limited to, any one or more of lignin composition, lignin structure, wood composition, cellulose polymerization, fiber dimensions, ratio of fibers to other plant components, plant cell division, plant cell development, number of cells per unit area, cell size, cell shape, cell wall composition, rate of wood formation, aesthetic appearance of wood, formation of stem defects, rate of growth, rate of root formation, ratio of root to branch vegetative development, leaf area index, and leaf shape. Improved characteristics also include increased or decreased lignin content, increased accessibility of lignin to chemical treatments, improved reactivity of lignin, increased or decreased cellulose content, increased dimensional stability, increased tensile strength, increased shear strength, increased compression strength, increased shock resistance, increased stiffness, increased or decreased hardness, decreased spirality, decreased shrinkage, and differences in weight, density, and specific gravity.

B. Expression Profiling of Cell Cycle Genes

The present invention also provides methods and tools for performing expression profiling of cell cycle genes. Expression profiling is useful in determining whether genes are transcribed or translated, comparing transcript levels for particular genes in different tissues, genotyping, estimating DNA copy number, determining identity of descent, measuring mRNA decay rates, identifying protein binding sites, determining subcellular localization of gene products, correlating gene expression to a phenotype or other phenomenon, and determining the effect on other genes of the manipulation of a particular gene. Expression profiling is particularly useful for identifying gene expression in complex, multigenic events. For this reason, expression profiling is useful in correlating gene expression to plant phenotype and formation of plant tissues and the interconnection thereof to the cell cycle.

Only a small fraction of the genes of a plant's genome are expressed at a given time in a given tissue sample, and not all of the expressed genes affect the plant phenotype. To identify genes capable of affecting a phenotype of interest, the present invention provides methods and tools for determining, for example, a gene expression profile at a given point in the cell cycle, a gene expression profile at a given point in plant development, and a gene expression profile in a given tissue sample. The invention also provides methods and tools for identifying cell cycle genes whose expression can be manipulated to alter plant phenotype or to alter the biological activity of cell cycle gene products. In support of these methods, the invention also provides methods and tools that distinguish expression of different genes of the same family.

As used herein, “gene expression” refers to the process of transcription of a DNA sequence into an RNA sequence, followed by translation of the RNA into a protein, which may or may not undergo post-translational processing. Thus, the relationship between cell cycle stage and/or developmental stage and gene expression can be observed by detecting, quantitatively or qualitatively, changes in the level of an RNA or a protein. As used herein, the term “biological activity” includes, but is not limited to, the activity of a protein gene product, including enzyme activity.

The present invention provides oligonucleotides that are useful in these expression profiling methods. Each oligonucleotide is capable of hybridizing under a given set of conditions to a cell cycle gene or gene product. In one aspect of the invention, a plurality of oligonucleotides is provided, wherein each oligonucleotide hybridizes under a given set of conditions to a different cell cycle gene product. Examples of oligonucleotides of the present invention include SEQ ID NOs: 471-697. Each of the oligonucleotides of SEQ ID NOs: 471-697 hybridizes under standard conditions to a different gene product of one of SEQ ID NOs: 1-237. The oligonucleotides of the invention are useful in determining the expression of one or more cell cycle genes in any of the above-described methods.

1. Cell, Tissue, Nucleic Acid, and Protein Samples

Samples for use in methods of the present invention may be derived from plant tissue. Suitable plant tissues include, but are not limited to, somatic embryos, pollen, leaves, stems, calli, stolons, microtubers, shoots, xylem, male strobili, pollen cones, vascular tissue, apical meristem, vascular cambium, root, flower, and seed.

According to the present invention, “plant tissue” is used as described previously herein. Plant tissue can be obtained from any of the plant types or species described supra.

In accordance with one aspect of the invention, samples are obtained from plant tissue at different stages of the cell cycle, from plant tissue at different developmental stages, from plant tissue at various times of the year (e.g. spring versus summer), from plant tissues subject to different environmental conditions (e.g. variations in light and temperature) and/or from different types of plant tissue and cells. In accordance with one embodiment, plant tissue is obtained during various stages of maturity and during different seasons of the year. For example, plant tissue can be collected from stem dividing cells, differentiating xylem, early developing wood cells, differentiated spring wood cells, and differentiated summer wood cells. As another example, gene expression in a sample obtained from a plant with developing wood can be compared to gene expression in a sample obtained from a plant which does not have developing wood.

Differentiating xylem includes samples obtained from compression wood, side-wood, and normal vertical xylem. Methods of obtaining samples for expression profiling from pine and eucalyptus are known. See, e.g., Allona et al., Proc. Nat'l Acad. Sci. 95:9693-8 (1998); Whetton et al., Plant Mol. Biol. 47:275-91; and Kirst et al., INT'L UNION OF FORESTRY RESEARCH ORGANIZATIONS BIENNIAL CONFERENCE, S6.8 (June 2003, Umea, Sweden).

In one embodiment of the invention, gene expression in one type of tissue is compared to gene expression in a different type of tissue or to gene expression in the same type of tissue at a different stage of development. Gene expression can also be compared in one type of tissue which is sampled at various times during the year (different seasons). For example, gene expression in juvenile secondary xylem can be compared to gene expression in mature secondary xylem. Similarly, gene expression in cambium can be compared to gene expression in xylem. Furthermore, gene expression in apical meristems can be compared to gene expression in cambium.

In an alternative embodiment, differences in gene expression are determined as cells from different tissues advance through the cell cycle. In this method, the cells from the different tissues are synchronized and their gene expression is profiled. Methods of synchronizing the stage of cell cycle in a sample are known. These methods include, e.g., cold acclimation, photoperiod, and aphidicolin. See, e.g., Nagata et al., Int. Rev. Cytol. 132:1-30 (1992), Breyne and Zabeau, Curr. Opin. Plant Biol. 4:136-42, 140 (2001). A sample is obtained during a specific stage of the cell cycle and gene expression in that sample is compared to a sample obtained during a different stage of the cell cycle. For example, tissue can be examined in any of the phases of the cell cycle, such as mitosis, G1, G0, S, and G2. In particular, one can examine the changes in gene expression at the G1, G2, and metaphase checkpoints.

In another embodiment of the invention, a sample is obtained from a plant having a specific phenotype and gene expression in that sample is compared to a sample obtained from a plant of the same species that does not have that phenotype. For example, a sample can be obtained from a plant exhibiting a fast rate of growth and gene expression can be compared with that of a sample obtained from a plant exhibiting a normal or slow rate of growth. Differentially expressed genes identified from such a comparison can be correlated with growth rate and, therefore, useful for manipulating growth rate.

In a further embodiment, a sample is obtained from clonally propagated plants. In one embodiment the clonally propagated plants are of the species Pinus or Eucalyptus. Individual ramets from the same genotype can be sacrificed at different times of year. Thus, for any genotype there can be at least two genetically identical trees sacrificed, early in the season and late in the season. Each of these trees can be divided into juvenile (top) to mature (bottom) samples. Further, tissue samples can be divided into, for example, phloem to xylem, in at least 5 layers of peeling. Each of these samples can be evaluated for phenotype and gene expression. See Entry 196.
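The sampling scheme just described can be enumerated programmatically. The following Python sketch assumes two harvest times, two stem positions, and exactly five peeling layers per genotype; the label strings and the function name are hypothetical and serve only to show how many samples such a design yields.

```python
from itertools import product

# Hypothetical labels for the factors named in the text.
SEASONS = ["early_season", "late_season"]
POSITIONS = ["juvenile_top", "mature_bottom"]
LAYERS = [f"layer_{i}" for i in range(1, 6)]   # 5 peelings, phloem to xylem


def sample_labels(genotype: str) -> list:
    """List one label per season x stem position x peeling layer."""
    return [
        f"{genotype}/{season}/{position}/{layer}"
        for season, position, layer in product(SEASONS, POSITIONS, LAYERS)
    ]


labels = sample_labels("genotype_A")
print(len(labels))   # 20 samples per genotype (2 x 2 x 5)
print(labels[0])     # genotype_A/early_season/juvenile_top/layer_1
```

Under these assumptions each genotype contributes 20 samples, each of which can be scored for phenotype and profiled for gene expression.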

Where cellular components may interfere with an analytical technique, such as a hybridization assay, enzyme assay, a ligand binding assay, or a biological activity assay, it may be desirable to isolate the gene products from such cellular components. Gene products, including nucleic acid and amino acid gene products, can be isolated from cell fragments or lysates by any method known in the art.

Nucleic acids used in accordance with the invention can be prepared by any available method or process, or by other processes as they become known in the art. Conventional techniques for isolating nucleic acids are detailed, for example, in Tijssen, LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY: HYBRIDIZATION WITH NUCLEIC ACID PROBES, chapter 3 (Elsevier Press, 1993), Berger and Kimmel, Methods Enzymol. 152:1 (1987), and GIBCO BRL & LIFE TECHNOLOGIES TRIZOL RNA ISOLATION PROTOCOL, Form No. 3786 (2000). Techniques for preparing nucleic acid samples, and sequencing polynucleotides from pine and eucalyptus are known. See, e.g., Allona et al., supra and Whetton et al., supra, and U.S. Application No. 60/476,222.

A suitable nucleic acid sample can contain any type of nucleic acid derived from the transcript of a cell cycle gene, i.e., RNA or a subsequence thereof or a nucleic acid for which an mRNA transcribed from a cell cycle gene served as a template. Suitable nucleic acids include cDNA reverse-transcribed from a transcript, RNA transcribed from that cDNA, DNA amplified from the cDNA, and RNA transcribed from the amplified DNA. Detection of such products or derived products is indicative of the presence and/or abundance of the transcript in the sample. Thus, suitable samples include, but are not limited to, transcripts of the gene or genes, cDNA reverse-transcribed from the transcript, cRNA transcribed from the cDNA, DNA amplified from the genes, and RNA transcribed from amplified DNA. As used herein, the category of “transcripts” includes but is not limited to pre-mRNA nascent transcripts, transcript processing intermediates, and mature mRNAs and degradation products thereof.

It is not necessary to monitor all types of transcripts to practice the invention. For example, the expression profiling methods of the invention can be conducted by detecting only one type of transcript, such as mature mRNA levels only.

In one aspect of the invention, a chromosomal DNA or cDNA library (comprising, for example, fluorescently labeled cDNA synthesized from total cell mRNA) is prepared for use in hybridization methods according to recognized methods in the art. See Sambrook et al., supra.

In another aspect of the invention, mRNA is amplified using, e.g., the MessageAmp kit (Ambion). In a further aspect, the mRNA is labeled with a detectable label. For example, mRNA can be labeled with a fluorescent chromophore, such as CyDye (Amersham Biosciences).

In some applications, it may be desirable to inhibit or destroy RNase that often is present in homogenates or lysates, before use in hybridization techniques. Methods of inhibiting or destroying nucleases are well known. In one embodiment of the invention, cells or tissues are homogenized in the presence of chaotropic agents to inhibit nuclease. In another embodiment, RNase is inhibited or destroyed by heat treatment, followed by proteinase treatment.

Protein samples can be obtained by any means known in the art. Protein samples useful in the methods of the invention include crude cell lysates and crude tissue homogenates. Alternatively, protein samples can be purified. Various methods of protein purification well known in the art can be found in Marshak et al., STRATEGIES FOR PROTEIN PURIFICATION AND CHARACTERIZATION: A LABORATORY COURSE MANUAL (Cold Spring Harbor Laboratory Press 1996).

2. Detecting Level of Gene Expression

For methods of the invention that comprise detecting a level of gene expression, any method for observing gene expression can be used, without limitation. Such methods include traditional nucleic acid hybridization techniques, polymerase chain reaction (PCR) based methods, and protein determination. The invention includes detection methods that use solid support-based assay formats as well as those that use solution-based assay formats.

Absolute measurements of the expression levels need not be made, although they can be made. The invention includes methods comprising comparisons of differences in expression levels between samples. Comparison of expression levels can be done visually or manually, or can be automated and done by a machine, using for example optical detection means. Subrahmanyam et al., Blood. 97: 2457 (2001); Prashar et al., Methods Enzymol. 303: 258 (1999). Hardware and software for analyzing differential expression of genes are available, and can be used in practicing the present invention. See, e.g., GenStat Software and GeneExpress® GX Explorer™ Training Manual, supra; Baxevanis & Francis-Ouellette, supra.
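A relative comparison of this kind can be sketched as a per-gene log2 ratio between two samples. The Python example below is an illustration under stated assumptions, not the analysis performed by the software cited above: the gene names and intensities are invented, the pseudocount of 1 is an arbitrary choice to avoid division by zero, and the two-fold cutoff is a common convention rather than a value taken from this specification.

```python
import math


def log2_ratios(sample_a: dict, sample_b: dict, pseudocount: float = 1.0) -> dict:
    """Return log2(a / b) for every gene measured in both samples."""
    return {
        gene: math.log2((sample_a[gene] + pseudocount) / (sample_b[gene] + pseudocount))
        for gene in sample_a
        if gene in sample_b
    }


def differentially_expressed(ratios: dict, min_abs_log2: float = 1.0) -> list:
    """Return genes whose absolute log2 ratio meets the cutoff (1.0 = two-fold)."""
    return sorted(g for g, r in ratios.items() if abs(r) >= min_abs_log2)


# Hypothetical signal intensities from a fast-growing and a slow-growing plant.
fast = {"geneA": 799.0, "geneB": 99.0, "geneC": 49.0}
slow = {"geneA": 199.0, "geneB": 99.0, "geneC": 199.0}

ratios = log2_ratios(fast, slow)
print(differentially_expressed(ratios))   # ['geneA', 'geneC']
```

Because only ratios are used, the two samples need not be measured on an absolute scale, provided they were processed and normalized in the same way.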

In accordance with one embodiment of the invention, nucleic acid hybridization techniques are used to observe gene expression. Exemplary hybridization techniques include Northern blotting, Southern blotting, solution hybridization, and S1 nuclease protection assays.

Nucleic acid hybridization typically involves contacting an oligonucleotide probe and a sample comprising nucleic acids under conditions where the probe can form stable hybrid duplexes with its complementary nucleic acid through complementary base pairing. For example, see PCT application WO 99/32660; Berger & Kimmel, Methods Enzymol. 152: 1 (1987). The nucleic acids that do not form hybrid duplexes are then washed away leaving the hybridized nucleic acids to be detected, typically through detection of an attached detectable label. The detectable label can be present on the probe, or on the nucleic acid sample. In one embodiment, the nucleic acids of the sample are detectably labeled polynucleotides representing the mRNA transcripts present in a plant tissue (e.g., a cDNA library). Detectable labels are commonly radioactive or fluorescent labels, but any label capable of detection can be used. Labels can be incorporated by several approaches described, for instance, in WO 99/32660, supra. In one aspect RNA can be amplified using the MessageAmp kit (Ambion) with the addition of aminoallyl-UTP as well as free UTP. The aminoallyl groups incorporated into the amplified RNA can be reacted with a fluorescent chromophore, such as CyDye (Amersham Biosciences).

Duplexes of nucleic acids are destabilized by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids. Under low stringency conditions (e.g., low temperature and/or high salt) hybrid duplexes (e.g., DNA:DNA, RNA:RNA or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus, specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature and/or lower salt and/or in the presence of destabilizing reagents) hybridization tolerates fewer mismatches.

Typically, stringent conditions for short probes (e.g., 10 to 50 nucleotide bases) will be those in which the salt concentration is at least about 0.01 to 1.0 M at pH 7.0 to 8.3 and the temperature is at least about 30° C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide.

Under some circumstances, it can be desirable to perform hybridization at conditions of low stringency, e.g., 6×SSPE-T (0.9 M NaCl, 60 mM NaH2PO4, pH 7.6, 6 mM EDTA, 0.005% Triton) at 37° C., to ensure hybridization. Subsequent washes can then be performed at higher stringency (e.g., 1×SSPE-T at 37° C.) to eliminate mismatched hybrid duplexes. Successive washes can be performed at increasingly higher stringency (e.g., down to as low as 0.25×SSPE-T at 37° C. to 50° C.) until a desired level of hybridization specificity is obtained.
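The salt concentrations of the wash buffers named above follow by linear dilution of the stated 6×SSPE composition. The short Python sketch below works through that arithmetic; the function and dictionary names are hypothetical, and the Triton component is omitted on the assumption that detergent is added separately rather than diluted with the salts.

```python
# Component concentrations of 6x SSPE as stated in the text.
SSPE_6X = {"NaCl_M": 0.9, "NaH2PO4_mM": 60.0, "EDTA_mM": 6.0}


def sspe_components(fold: float) -> dict:
    """Scale the 6x SSPE recipe linearly to the requested fold concentration."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: value * fold / 6.0 for name, value in SSPE_6X.items()}


for fold in (6.0, 1.0, 0.25):
    # Round only for display; 1x is about 0.15 M NaCl, 0.25x about 0.0375 M NaCl.
    print(fold, {k: round(v, 4) for k, v in sspe_components(fold).items()})
```

The progression from 0.9 M to roughly 0.15 M and then 0.0375 M NaCl shows why each successive wash is more stringent: lower salt destabilizes mismatched duplexes.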

In general, standard conditions for hybridization are a compromise between stringency (hybridization specificity) and signal intensity. Thus, in one embodiment of the invention, the hybridized nucleic acids are washed at successively higher stringency conditions and read between each wash. Analysis of the data sets produced in this manner will reveal a wash stringency above which the hybridization pattern is not appreciably altered and which provides adequate signal for the particular oligonucleotide probes of interest. For example, the final wash may be selected as that of the highest stringency that produces consistent results and that provides a signal intensity greater than approximately 10% of the background intensity.
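The selection rule in the preceding example can be expressed as a simple filter over the reads taken between washes. The Python sketch below applies the stated criterion literally (signal greater than 10% of background, and results flagged as consistent); the record layout, the wash labels, the intensities, and the way consistency is judged are all assumptions made for illustration.

```python
def select_final_wash(reads: list, min_fraction: float = 0.10):
    """Pick the most stringent acceptable wash.

    `reads` must be ordered from lowest to highest stringency. Each read is a
    dict with keys 'label', 'signal', 'background' and 'consistent' (a flag the
    user sets by whatever replicate criterion is appropriate).
    Returns the chosen read, or None if no wash qualifies.
    """
    chosen = None
    for read in reads:
        if read["consistent"] and read["signal"] > min_fraction * read["background"]:
            chosen = read   # later entries are more stringent, so keep the last hit
    return chosen


# Hypothetical reads taken after each successive wash.
reads = [
    {"label": "6x SSPE-T", "signal": 900.0, "background": 100.0, "consistent": True},
    {"label": "1x SSPE-T", "signal": 400.0, "background": 80.0, "consistent": True},
    {"label": "0.25x SSPE-T", "signal": 5.0, "background": 60.0, "consistent": False},
]

print(select_final_wash(reads)["label"])   # 1x SSPE-T
```

In practice the threshold and the consistency test would be tuned to the particular probes and detection system.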

a. Oligonucleotide Probes

Oligonucleotide probes useful in nucleic acid hybridization techniques employed in the present invention are capable of binding to a nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing via hydrogen bond formation. A probe can include natural bases (i.e., A, G, U, C or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the nucleotide bases in the probes can be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. Thus, probes can be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.

Oligonucleotide probes can be prepared by any means known in the art. Probes useful in the present invention are capable of hybridizing to a nucleotide product of cell cycle genes, such as one of SEQ ID NOs: 1-237. Probes useful in the invention can be generated using the nucleotide sequences disclosed in SEQ ID NOs: 1-237. The invention includes oligonucleotide probes having at least a 2, 10, 15, 20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, or 100 nucleotide fragment of a corresponding contiguous sequence of any one of SEQ ID NOs: 1-237. The invention includes oligonucleotides of less than 2, 1, 0.5, 0.1, or 0.05 kb in length. In one embodiment, the oligonucleotide is 60 nucleotides in length.
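For illustration, contiguous fragments of a chosen length, and the complementary sequences that would hybridize to them, can be enumerated as in the following Python sketch. The function names and the short example sequence are hypothetical, and only unambiguous DNA bases (A, C, G, T) are assumed.

```python
from typing import List

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def contiguous_fragments(sequence: str, length: int) -> List[str]:
    """Return every contiguous fragment of the given length."""
    seq = sequence.upper()
    if length < 1 or length > len(seq):
        return []
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


# Hypothetical toy sequence; a real input would be one of the
# disclosed sequences and a length such as 60.
fragments = contiguous_fragments("ACGTAC", 4)
probes = [reverse_complement(f) for f in fragments]
```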

Oligonucleotide probes can be designed by any means known in the art. See, e.g., Li and Stormo, Bioinformatics 17: 1067-76 (2001). Oligonucleotide probe design can be effected using software. Exemplary software includes ArrayDesigner, GeneScan, and ProbeSelect. Probes complementary to a defined nucleic acid sequence can be synthesized chemically, generated from longer nucleotides using restriction enzymes, or can be obtained using techniques such as polymerase chain reaction (PCR). PCR methods are well known and are described, for example, in Innis et al. eds., PCR PROTOCOLS: A GUIDE TO METHODS AND APPLICATIONS, Academic Press Inc. San Diego, Calif. (1990). The probes can be labeled, for example, with a radioactive, biotinylated, or fluorescent tag. Optimally, the nucleic acids in the sample are labeled and the probes are not labeled. Oligonucleotide probes generated by the above methods can be used in solution or solid support-based methods.

The invention includes oligonucleotide probes that hybridize to a product of the coding region or a 3′ untranslated region (3′ UTR) of a cell cycle gene. In one embodiment, the oligonucleotide probe hybridizes to the 3′ UTR of any one of SEQ ID NOs: 1-237. The 3′ UTR is generally a unique region of the gene, even among members of the same family. Therefore, probes capable of hybridizing to a product of the 3′ UTR can be useful for differentiating the expression of individual genes within a family where the coding regions of the genes are likely to be highly homologous. This allows for the design of oligonucleotide probes to be used as members of a plurality of oligonucleotides, each capable of uniquely binding to a single gene. In another embodiment, the oligonucleotide probe comprises any one of SEQ ID NOs: 471-697. In another embodiment, the oligonucleotide probe consists of any one of SEQ ID NOs: 471-697.

b. Oligonucleotide Array Methods

One embodiment of the invention employs two or more oligonucleotide probes in combination to detect a level of expression of one or more cell cycle genes, such as the genes of SEQ ID NOs: 1-237. In one aspect of this embodiment, the level of expression of two or more different genes is detected. The two or more genes may be from the same or different cell cycle gene families discussed above. Each of the two or more oligonucleotides may hybridize to a different one of the genes.

One embodiment of the invention employs two or more oligonucleotide probes, each of which specifically hybridizes to a polynucleotide derived from the transcript of a gene provided by SEQ ID NOs: 1-237. Another embodiment employs two or more oligonucleotide probes, at least one of which comprises a nucleic acid sequence of SEQ ID NOs: 471-697. Another embodiment employs two or more oligonucleotide probes, at least one of which consists of SEQ ID NOs: 471-697.

The oligonucleotide probes may comprise from about 5 to about 60, or from about 5 to about 500, nucleotide bases, such as from about 60 to about 100 nucleotide bases, including from about 15 to about 60 nucleotide bases.

One embodiment of the invention uses solid support-based oligonucleotide hybridization methods to detect gene expression. Solid support-based methods suitable for practicing the present invention are widely known and are described, for example, in PCT application WO 95/11755; Huber et al., Anal. Biochem. 299: 24 (2001); Meiyanto et al., Biotechniques. 31: 406 (2001); Relogio et al., Nucleic Acids Res. 30:e51 (2002). Any solid surface to which oligonucleotides can be bound, covalently or non-covalently, can be used. Such solid supports include filters, polyvinyl chloride dishes, silicon or glass based chips, etc.

One embodiment uses oligonucleotide arrays, i.e. microarrays, which can be used to simultaneously observe the expression of a number of genes or gene products. Oligonucleotide arrays comprise two or more oligonucleotide probes provided on a solid support, wherein each probe occupies a unique location on the support. The location of each probe may be predetermined, such that detection of a detectable signal at a given location is indicative of hybridization to an oligonucleotide probe of a known identity. Each predetermined location can contain more than one molecule of a probe, but each molecule within the predetermined location has an identical sequence. Such predetermined locations are termed features. There can be, for example, 2, 10, 100, 1,000, 2,000, or 5,000 or more such features on a single solid support. In one embodiment, each oligonucleotide is located at a unique position on an array at least 2, at least 3, at least 4, at least 5, at least 6, or at least 10 times.
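The notion of features at predetermined locations, with each probe placed several times, can be sketched as a simple mapping from grid position to probe identity. The following Python sketch is illustrative only; the function name `layout_array`, the row and column grid, and the fill order are assumptions and do not describe any particular array format.

```python
from itertools import product
from typing import Dict, List, Tuple

Position = Tuple[int, int]  # (row, column) of a feature


def layout_array(probe_ids: List[str], rows: int, cols: int,
                 replicates: int = 2) -> Dict[Position, str]:
    """Assign each probe to `replicates` distinct features on a grid."""
    positions = list(product(range(rows), range(cols)))
    if len(probe_ids) * replicates > len(positions):
        raise ValueError("not enough features for the requested layout")
    layout: Dict[Position, str] = {}
    slots = iter(positions)
    # Fill one complete replicate set at a time so that copies of the
    # same probe land in different regions of the grid.
    for _ in range(replicates):
        for probe_id in probe_ids:
            layout[next(slots)] = probe_id
    return layout
```

Because the position of every feature is known, a signal detected at a given `(row, column)` can be looked up directly to identify the probe.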

Oligonucleotide probe arrays for detecting gene expression can be made and used according to conventional techniques described, for example, in Lockhart et al., Nat'l Biotech. 14: 1675 (1996), McGall et al., Proc. Nat'l Acad. Sci. USA 93: 13555 (1996), and Hughes et al., Nature Biotechnol. 19:342 (2001). A variety of oligonucleotide array designs is suitable for the practice of this invention.

In one embodiment, the one or more oligonucleotides include a plurality of oligonucleotides that each hybridize to a different gene expressed in a particular tissue type. For example, the tissue can be developing wood.

In one embodiment, a nucleic acid sample obtained from a plant can be amplified and, optionally, labeled with a detectable label. Any method of nucleic acid amplification and any detectable label suitable for such purpose can be used. For example, amplification reactions can be performed using Ambion's MessageAmp, which creates “antisense” RNA or “aRNA” (complementary in nucleic acid sequence to the RNA extracted from the sample tissue). The RNA can optionally be labeled using CyDye fluorescent labels. During the amplification step, aaUTP is incorporated into the resulting aRNA. The CyDye fluorescent labels are coupled to the aaUTPs in a non-enzymatic reaction. Subsequent to the amplification and labeling steps, labeled amplified antisense RNAs are precipitated and washed with appropriate buffer, and then assayed for purity. For example, purity can be assayed using a NanoDrop spectrophotometer. The nucleic acid sample is then contacted with an oligonucleotide array having, attached to a solid substrate (a “microarray slide”), oligonucleotide sample probes capable of hybridizing to nucleic acids of interest which may be present in the sample. The step of contacting is performed under conditions where hybridization can occur between the nucleic acids of interest and the oligonucleotide probes present on the array. The array is then washed to remove non-specifically bound nucleic acids, and the signals from the labeled molecules that remain hybridized to oligonucleotide probes on the solid substrate are detected. The step of detection can be accomplished using any method appropriate to the type of label used. For example, the step of detecting can be accomplished using a laser scanner and detector. For example, one can use an Axon scanner, which optionally uses GenePix Pro software to analyze the position of the signal on the microarray slide.

Data from one or more microarray slides can be analyzed by any appropriate method known in the art.

Oligonucleotide probes used in the methods of the present invention, including microarray techniques, can be generated using PCR. PCR primers used in generating the probes are chosen, for example, based on the sequences of SEQ ID NOs:1-237, to result in amplification of unique fragments of the cell cycle genes (i.e., fragments that hybridize to only one polynucleotide of any one of SEQ ID NOs: 1-237 under standard hybridization conditions). Computer programs are useful in the design of primers with the required specificity and optimal hybridization properties. For example, Li and Stormo, supra at 1075, discuss a method of probe selection using ProbeSelect which selects an optimum oligonucleotide probe based on the entire gene sequence as well as other gene sequences to be probed at the same time.

In one embodiment, oligonucleotide control probes also are used. Exemplary control probes can fall into at least one of three categories referred to herein as (1) normalization controls, (2) expression level controls and (3) negative controls. In microarray methods, one or more of these control probes may be provided on the array with the inventive cell cycle gene-related oligonucleotides.

Normalization controls correct for dye biases, tissue biases, dust, slide irregularities, malformed slide spots, etc. Normalization controls are oligonucleotide or other nucleic acid probes that are complementary to labeled reference oligonucleotides or other nucleic acid sequences that are added to the nucleic acid sample to be screened. The signals obtained from the normalization controls, after hybridization, provide a control for variations in hybridization conditions, label intensity, reading efficiency and other factors that can cause the signal of a perfect hybridization to vary between arrays. In one embodiment, signals (e.g., fluorescence intensity or radioactivity) read from all other probes used in the method are divided by the signal from the control probes, thereby normalizing the measurements.
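The division-based normalization described in the last sentence above can be sketched in Python as follows. This is a minimal illustration that assumes the control signals are summarized by their arithmetic mean; the function name and the dictionary layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def normalize_signals(test_signals: Dict[str, float],
                      control_signals: List[float]) -> Dict[str, float]:
    """Divide each test-probe signal by the mean normalization-control signal."""
    control = mean(control_signals)
    if control <= 0:
        raise ValueError("control signal must be positive")
    return {probe: value / control for probe, value in test_signals.items()}
```

Applying the same function to each array allows measurements from different arrays to be compared on a common scale.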

Virtually any probe can serve as a normalization control. Hybridization efficiency varies, however, with base composition and probe length. Preferred normalization probes are selected to reflect the average length of the other probes being used, but they also can be selected to cover a range of lengths. Further, the normalization control(s) can be selected to reflect the average base composition of the other probes being used. In one embodiment, only one or a few normalization probes are used, and they are selected such that they hybridize well (i.e., without forming secondary structures) and do not match any test probes. In one embodiment, the normalization controls are mammalian genes.

Expression level control probes hybridize specifically with constitutively expressed genes present in the biological sample. Virtually any constitutively expressed gene provides a suitable target for expression level control probes. Typically, expression level control probes have sequences complementary to subsequences of constitutively expressed “housekeeping genes” including, but not limited to, certain photosynthesis genes.

“Negative control” probes are not complementary to any of the test oligonucleotides (i.e., the inventive cell cycle gene-related oligonucleotides), normalization controls, or expression controls. In one embodiment, the negative control is a mammalian gene which is not complementary to any other sequence in the sample.

The terms “background” and “background signal intensity” refer to hybridization signals resulting from non-specific binding or other interactions between the labeled target nucleic acids (i.e., mRNA present in the biological sample) and components of the oligonucleotide array. Background signals also can be produced by intrinsic fluorescence of the array components themselves.

A single background signal can be calculated for the entire array, or a different background signal can be calculated for each target nucleic acid. In one embodiment, background is calculated as the average hybridization signal intensity for the lowest 5 to 10 percent of the oligonucleotide probes being used, or, where a different background signal is calculated for each target gene, for the lowest 5 to 10 percent of the probes for each gene. Where the oligonucleotide probes corresponding to a particular cell cycle gene hybridize well and, hence, appear to bind specifically to a target sequence, they should not be used in a background signal calculation. Alternatively, background can be calculated as the average hybridization signal intensity produced by hybridization to probes that are not complementary to any sequence found in the sample (e.g., probes directed to nucleic acids of the opposite sense or to genes not found in the sample). In microarray methods, background can be calculated as the average signal intensity produced by regions of the array that lack any oligonucleotide probes at all.
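The first background calculation described above, the average of the lowest 5 to 10 percent of probe intensities, can be sketched as follows. The function name and the rounding-up of the probe count are assumptions made for illustration.

```python
import math
from typing import Sequence


def background_from_lowest(intensities: Sequence[float],
                           percent: int = 5) -> float:
    """Average the lowest `percent` percent of probe intensities.

    At least one probe is always used, and the count is rounded up.
    """
    if not intensities:
        raise ValueError("no intensities supplied")
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(intensities)
    count = max(1, math.ceil(len(ordered) * percent / 100))
    return sum(ordered[:count]) / count
```

To compute a separate background for each target gene, the same function could be called once per gene with only that gene's probe intensities.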

c. PCR-Based Methods

In another embodiment, PCR-based methods are used to detect gene expression. These methods include reverse-transcriptase-mediated polymerase chain reaction (RT-PCR) including real-time and endpoint quantitative reverse-transcriptase-mediated polymerase chain reaction (Q-RTPCR). These methods are well known in the art. For example, methods of quantitative PCR can be carried out using kits and methods that are commercially available from, for example, Applied BioSystems and Stratagene®. See also Kochanowski, QUANTITATIVE PCR PROTOCOLS (Humana Press, 1999); Innis et al., supra.; Vandesompele et al., Genome Biol. 3: RESEARCH0034 (2002); Stein, Cell Mol. Life Sci. 59: 1235 (2002).

Gene expression can also be observed in solution using Q-RTPCR. Q-RTPCR relies on detection of a fluorescent signal produced proportionally during amplification of a PCR product. See Innis et al., supra. Like the traditional PCR method, this technique employs PCR oligonucleotide primers, typically 15-30 bases long, that hybridize to opposite strands and regions flanking the DNA region of interest. Additionally, a probe (e.g., TaqMan®, Applied Biosystems) is designed to hybridize to the target sequence between the forward and reverse primers traditionally used in the PCR technique. The probe is labeled at the 5′ end with a reporter fluorophore, such as 6-carboxyfluorescein (6-FAM) and a quencher fluorophore like 6-carboxy-tetramethyl-rhodamine (TAMRA). As long as the probe is intact, fluorescent energy transfer occurs which results in the absorbance of the fluorescence emission of the reporter fluorophore by the quenching fluorophore. As Taq polymerase extends the primer, however, the intrinsic 5′ to 3′ nuclease activity of Taq degrades the probe, releasing the reporter fluorophore. The increase in the fluorescence signal detected during the amplification cycle is proportional to the amount of product generated in each cycle.
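The geometric constraints described above (primers of typically 15-30 bases on opposite strands, with the probe lying between them) can be checked with a short Python sketch. It is a simplified illustration under stated assumptions: all sequences are exact matches, the probe is written in the same sense as the target strand supplied, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def probe_lies_between_primers(target: str, forward: str, reverse: str,
                               probe: str, min_len: int = 15,
                               max_len: int = 30) -> bool:
    """Check primer lengths and that the probe site falls between the primers."""
    if not all(min_len <= len(p) <= max_len for p in (forward, reverse)):
        return False
    fwd_start = target.find(forward)
    # The reverse primer anneals to the opposite strand, so its site on
    # the supplied strand is its reverse complement.
    rev_start = target.find(reverse_complement(reverse))
    probe_start = target.find(probe)
    if min(fwd_start, rev_start, probe_start) < 0:
        return False
    fwd_end = fwd_start + len(forward)
    probe_end = probe_start + len(probe)
    return fwd_end <= probe_start and probe_end <= rev_start
```

A full design tool would also weigh nucleotide content and uniqueness against other transcripts; those criteria are outside this sketch.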

The forward and reverse amplification primers and internal hybridization probe are designed to hybridize specifically and uniquely with one nucleotide derived from the transcript of a target gene. In one embodiment, the selection criteria for primer and probe sequences incorporate constraints regarding nucleotide content and size to accommodate TaqMan® requirements.

SYBR Green® can be used as a probe-less Q-RTPCR alternative to the TaqMan®-type assay discussed above. See ABI PRISM® 7900 SEQUENCE DETECTION SYSTEM USER GUIDE, APPLIED BIOSYSTEMS, chap. 1-8, App. A-F (2002).

A device measures changes in fluorescence emission intensity during PCR amplification. The measurement is done in “real time,” that is, as the amplification product accumulates in the reaction. Other methods can be used to measure changes in fluorescence resulting from probe digestion. For example, fluorescence polarization can distinguish between large and small molecules based on molecular tumbling (see U.S. Pat. No. 5,593,867).

d. Protein Detection Methods

Proteins can be observed by any means known in the art, including immunological methods, enzyme assays and protein array/proteomics techniques.

Measurement of the translational state can be performed according to several protein methods. For example, whole genome monitoring of protein—the “proteome”—can be carried out by constructing a microarray in which binding sites comprise immobilized, preferably monoclonal, antibodies specific to a plurality of proteins having an amino acid sequence of any of SEQ ID NOs: 261-497 or proteins encoded by the genes of SEQ ID NOs:1-237 or conservative variants thereof. See Wildt et al., Nature Biotechnol. 18: 989 (2000). Methods for making polyclonal and monoclonal antibodies are well known, as described, for instance, in Harlow & Lane, ANTIBODIES: A LABORATORY MANUAL (Cold Spring Harbor Laboratory Press, 1988).

Alternatively, proteins can be separated by two-dimensional gel electrophoresis systems. Two-dimensional gel electrophoresis is well-known in the art and typically involves isoelectric focusing along a first dimension followed by SDS-PAGE electrophoresis along a second dimension. See, e.g., Hames et al., GEL ELECTROPHORESIS OF PROTEINS: A PRACTICAL APPROACH (IRL Press, 1990). The resulting electropherograms can be analyzed by numerous techniques, including mass spectrometric techniques, western blotting and immunoblot analysis using polyclonal and monoclonal antibodies, and internal and N-terminal micro-sequencing.

3. Correlating Gene Expression to Phenotype and Tissue Development

As discussed above, the invention provides methods and tools to correlate gene expression to plant phenotype. Gene expression may be examined in a plant having a phenotype of interest and compared to a plant that does not have the phenotype or has a different phenotype. Such a phenotype includes, but is not limited to, increased drought tolerance, herbicide resistance, reduced or increased height, reduced or increased branching, enhanced cold and frost tolerance, improved vigor, enhanced color, enhanced health and nutritional characteristics, improved storage, enhanced yield, enhanced salt tolerance, enhanced resistance of the wood to decay, enhanced resistance to fungal diseases, altered attractiveness to insect pests, enhanced heavy metal tolerance, increased disease tolerance, increased insect tolerance, increased water-stress tolerance, enhanced sweetness, improved texture, decreased phosphate content, increased germination, increased micronutrient uptake, improved starch composition, improved flower longevity, production of novel resins, and production of novel proteins or peptides.

In another embodiment, the phenotype includes one or more of the following traits: propensity to form reaction wood, a reduced period of juvenility, an increased period of juvenility, self-abscising branches, accelerated reproductive development or delayed reproductive development.

In a further embodiment, the phenotype that differs in the plants compared includes one or more of the following: lignin quality, lignin structure, wood composition, wood appearance, wood density, wood strength, wood stiffness, cellulose polymerization, fiber dimensions, lumen size, other plant components, plant cell division, plant cell development, number of cells per unit area, cell size, cell shape, cell wall composition, rate of wood formation, aesthetic appearance of wood, formation of stem defects, average microfibril angle, width of the S2 cell wall layer, rate of growth, rate of root formation, ratio of root to branch vegetative development, leaf area index, and leaf shape.

Phenotype can be assessed by any suitable means as discussed above.

In a further embodiment, gene expression can be correlated to a given point in the cell cycle, a given point in plant development, and a given tissue sample. Plant tissue can be examined at different stages of the cell cycle, at different developmental stages, at various times of the year (e.g. spring versus summer), after exposure to different environmental conditions (e.g. variations in light and temperature), and/or across different types of plant tissue and cells. In accordance with one embodiment, plant tissue is obtained during various stages of maturity and during different seasons of the year. For example, plant tissue can be collected from stem dividing cells, differentiating xylem, early developing wood cells, differentiated spring wood cells, and differentiated summer wood cells.

It will be apparent to those skilled in the art that various modifications and variations can be made in the methods and compositions of the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.

The following examples are given to illustrate the present invention. It should be understood, however, that the invention is not to be limited to the specific conditions or details described in these examples. Throughout the specification, any and all references to a publicly available document, including a U.S. patent, are specifically incorporated by reference.

Examples

Example 1

Example 1 illustrates a procedure for RNA extraction and purification, which is particularly useful for RNA obtained from conifer needle, xylem, cambium, and phloem.

Tissue is obtained from conifer needle, xylem, cambium or phloem. The tissue is frozen in liquid nitrogen and ground. The total RNA is extracted using Concert Plant RNA reagent (Invitrogen). The resulting RNA sample is extracted into phenol:chloroform and treated with DNase. The RNA is then incubated at 65° C. for 2 minutes followed by centrifugation at 4° C. for 30 minutes. Following centrifugation, the RNA is extracted into phenol at least 10 times to remove contaminants.

The RNA is further cleaned using RNeasy columns (Qiagen). The purified RNA is quantified using RiboGreen reagent (Molecular Probes) and purity assessed by gel electrophoresis.

RNA is then amplified using MessageAmp (Ambion). Aminoallyl-UTP and free UTP are added to the in vitro transcription of the purified RNA at a ratio of 4:1 aminoallyl-UTP-to-UTP. The aminoallyl-UTP is incorporated into the new RNA strand as it is transcribed. The amino-allyl group is then reacted with Cy dyes to attach the colorimetric label to the resulting amplified RNA using the Amersham procedure modified for use with RNA. Unincorporated dye is removed by ethanol precipitation. The labeled RNA is quantified spectrophotometrically (NanoDrop). The labeled RNA is fragmented by heating to 95° C. as described in Hughes et al., Nature Biotechnol. 19:342 (2001).

Example 2

Example 2 illustrates how cell cycle genes important for wood development in Pinus radiata can be determined and how oligonucleotides which uniquely bind to those genes can be designed and synthesized for use on a microarray.

Pine trees of the species Pinus radiata are grown under natural light conditions. Tissue samples are prepared as described in, e.g., Sterky et al., Proc. Nat'l Acad. Sci. 95:13330 (1998). Specifically, tissue samples are collected from woody trees having a height of 5 meters. Tissue samples of the woody trees are prepared by taking tangential sections through the cambial region of the stem. The stems are sectioned horizontally into sections ranging from juvenile (top) to mature (bottom). The stem sections separated by stage of development are further separated into 5 layers by peeling into sections of phloem, differentiating phloem, cambium, differentiating xylem, developing xylem, and mature xylem. Tissue samples, including leaves, buds, shoots, and roots are also prepared from seedlings of the species Pinus radiata.

RNA is isolated and ESTs generated as described in Example 1 or Sterky et al., supra. The nucleic acid sequences of ESTs derived from samples containing developing wood are compared with nucleic acid sequences of genes known to be involved in the plant cell cycle. ESTs from samples that do not contain developing wood are also compared with sequences of genes known to be involved in the plant cell cycle. An in silico hybridization analysis is performed using BLAST (NCBI). Sequences from among the known cell cycle genes that show hybridization in silico to ESTs made from samples containing developing wood, but that do not hybridize to ESTs from samples not containing developing wood are selected for further examination.
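The selection logic of the preceding paragraph reduces to a set comparison once the sequence matches are known. The following Python sketch assumes that the similarity search has already been run and summarized as two sets of gene identifiers; it does not invoke BLAST, and the function name and identifiers are hypothetical.

```python
from typing import Iterable, List, Set


def wood_preferred_genes(known_cell_cycle_genes: Iterable[str],
                         hits_in_wood_ests: Set[str],
                         hits_in_non_wood_ests: Set[str]) -> List[str]:
    """Return known genes matched by wood ESTs but not by non-wood ESTs."""
    return sorted(
        gene for gene in known_cell_cycle_genes
        if gene in hits_in_wood_ests and gene not in hits_in_non_wood_ests
    )


# Hypothetical identifiers used only to show the call.
selected = wood_preferred_genes(
    ["geneA", "geneB", "geneC"],
    hits_in_wood_ests={"geneA", "geneB"},
    hits_in_non_wood_ests={"geneB"},
)
```

Genes returned by such a filter would then be candidates for the further examination described above.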

cDNA clones containing sequences that hybridize to the genes showing wood-preferred expression are selected from cDNA libraries using techniques well known in the art of molecular biology. Using the sequence information, oligonucleotides are designed such that each oligonucleotide is specific for only one cDNA sequence in the library. The oligonucleotide sequences are provided in Table 14. 60-mer oligonucleotide probes are designed using the method of Li and Stormo, supra or using software such as ArrayDesigner, GeneScan, and ProbeSelect.

The oligonucleotides are then synthesized in situ as described in Hughes et al., Nature Biotechnol. 19:342 (2001) or as described in Kane et al., Nucleic Acids Res. 28:4552 (2000) and affixed to an activated glass slide (Sigma-Genosys, The Woodlands, Tex.) using a 5′ amino linker. The position of each oligonucleotide on the slide is known.

Example 3

Example 3 illustrates how cell cycle genes important for wood development in Eucalyptus grandis can be determined and how oligonucleotides which uniquely bind to those genes can be designed and synthesized for use on a microarray.

Eucalyptus trees of the species Eucalyptus grandis are grown under natural light conditions. Tissue samples are prepared as described in, e.g., Sterky et al., Proc. Nat'l Acad. Sci. 95:13330 (1998). Specifically, tissue samples are collected from woody trees having a height of 5 meters. Tissue samples of the woody trees are prepared by taking tangential sections through the cambial region of the stem. The stems are sectioned horizontally into sections ranging from juvenile (top) to mature (bottom). The stem sections separated by stage of development are further separated into 5 layers by peeling into sections of phloem, differentiating phloem, cambium, differentiating xylem, developing xylem, and mature xylem. Tissue samples, including leaves, buds, shoots, and roots are also prepared from seedlings of the species Pinus radiata.

RNA is isolated and ESTs generated as described in Example 1 or Sterky et al., supra. The nucleic acid sequences of ESTs derived from samples containing developing wood are compared with nucleic acid sequences of genes known to be involved in the plant cell cycle. ESTs from samples that do not contain developing wood are also compared with sequences of genes known to be involved in the plant cell cycle. An in silico hybridization analysis is performed as described in, for example, Audic and Claverie, Genome Res. 7:986 (1997). Sequences from among the known cell cycle genes that show hybridization in silico to ESTs made from samples containing developing wood, but do not hybridize to ESTs from samples not containing developing wood are selected for further examination.

cDNA clones containing sequences that hybridize to the genes showing wood-preferred expression are selected from cDNA libraries using techniques well known in the art of molecular biology. Using the sequence information, oligonucleotides are designed such that each oligonucleotide is specific for only one cDNA sequence in the library. The oligonucleotide sequences are provided in Table 14. 60-mer oligonucleotide probes are designed using the method of Li and Stormo, supra or using software such as ArrayDesigner, GeneScan, and ProbeSelect.

The oligonucleotides are then synthesized in situ as described in Hughes et al., Nature Biotechnol. 19:342 (2001) or as described in Kane et al., Nucleic Acids Res. 28:4552 (2000) and affixed to an activated glass slide (Sigma-Genosys, The Woodlands, Tex.) using a 5′ amino linker. The position of each oligonucleotide on the slide is known.

Example 4

Example 4 illustrates how to detect expression of Pinus radiata cell cycle genes which are important in wood formation using an oligonucleotide microarray prepared as in Example 2. This is an example of a balanced incomplete block designed experiment carried out using aRNA samples prepared from mature-phase phloem (P), cambium (C), expanding xylem found in a layer below the cambium (X1) and differentiating, lignifying xylem cells found deeper in the same growth ring (X2). In this example, cell cycle gene expression is compared among the four samples, namely P, C, X1, and X2.

In the summer, plants of the species Pinus radiata are felled and the bark of the main stem is immediately pulled gently away to reveal the phloem and xylem. The phloem and xylem are then peeled with a scalpel into separate containers of liquid nitrogen. Needles (leaves) and buds from the trees are also harvested with a scalpel into separate containers of liquid nitrogen. RNA is subsequently isolated from the frozen tissue samples as described in Example 1. Equal microgram quantities of total RNA are purified from each sample using RNeasy Mini columns (Qiagen, Valencia, Calif.) according to the manufacturer's instructions.

Amplification reactions are carried out for each of the P, C, X1, and X2 tissue samples. Amplification reactions are performed using Ambion's MessageAmp kit, a T7-based amplification procedure, following the manufacturer's instructions, except that labeled aaUTP is added to the reagent mix during the amplification step. aaUTP is incorporated into the resulting antisense RNA formed during this step. CyDye fluorescent labels are coupled to the aaUTPs in a non-enzymatic reaction as described in Example 1. Labeled amplified antisense RNAs are precipitated and washed, and then assayed for purity using a NanoDrop spectrophotometer. These labeled antisense RNAs, corresponding to the RNA isolated from the P, C, X1, and X2 tissue samples, constitute the sample nucleic acids, which are referred to as the P, C, X1, and X2 samples.

Normalization control samples of known nucleic acids are added to each sample in a dilution series of 500, 200, 100, 50, 25 and 10 pg/μl for quantitation of the signals. Positive controls corresponding to specific genes showing expression in all tissues of pine, such as housekeeping genes, are also added to the plant sample.
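One way the dilution series above could be used for quantitation is as a standard curve. The following Python sketch is offered only as an assumption about how such a curve might be built: it fits a straight line to log10(signal) against log10(concentration) by ordinary least squares and inverts it to estimate an unknown. The function names are hypothetical and no measured signal values are implied.

```python
import math
from typing import Sequence, Tuple

# Concentrations of the control dilution series named in the text (pg/uL).
DILUTION_SERIES_PG_PER_UL = [500, 200, 100, 50, 25, 10]


def fit_log_log(concentrations: Sequence[float],
                signals: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log10(signal) = slope * log10(conc) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_concentration(signal: float, slope: float,
                           intercept: float) -> float:
    """Invert the fitted line to estimate concentration from a signal."""
    return 10 ** ((math.log10(signal) - intercept) / slope)
```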

Each of four microarray slides is incubated with 125 μL of a P, C, X1 or X2 sample under a coverslip at 42° C. for 16-18 hours. The arrays are washed in 1×SSC, 0.1% SDS for 10 minutes and then in 0.1×SSC, 0.1% SDS for 10 minutes and then allowed to dry.

The array slides are scanned using an Axon laser scanner and analyzed using GenePix Pro software. Data from the microarray slides are subjected to microarray data analysis using GenStat, SAS, or Spotfire software. Outliers are removed and ratiometric data for each of the datasets are normalized using a global normalization which employs a cubic spline fit applied to correct for differential dye bias and spatial effects. A second transformation is performed to fit control signal ratios to a mean log2=0 (i.e. 1:1 ratio). Normalized data are then subjected to a variance analysis.
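The second transformation described above, which brings the control signal ratios to a mean log2 of 0, can be sketched as a simple centering step. The Python sketch below does not reproduce the cubic spline normalization or the cited software; the function name and data layout are assumptions.

```python
import math
from typing import Dict, Iterable


def center_on_controls(ratios: Dict[str, float],
                       control_ids: Iterable[str]) -> Dict[str, float]:
    """Return log2 ratios shifted so the controls average log2 = 0."""
    log_ratios = {probe: math.log2(r) for probe, r in ratios.items()}
    controls = list(control_ids)
    if not controls:
        raise ValueError("at least one control probe is required")
    # Offset is the mean log2 ratio observed for the control probes.
    offset = sum(log_ratios[c] for c in controls) / len(controls)
    return {probe: value - offset for probe, value in log_ratios.items()}
```

After centering, a value of 0 corresponds to a 1:1 ratio relative to the controls.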

Mean signal intensity for each signal at any given position on the microarray slide is determined for each of three of P, C, X1, and X2 sample microarray slides. This mean signal/probe position is compared to the signal at the same position on the sample slide which was not used for calculating the mean. For example, a mean signal at a given position is determined for P, C, and X1 and the signal at that position in the X2 microarray slide is compared to the P, C, and X1 mean signal value.
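The comparison of one sample against the mean of the other three can be sketched in Python as follows. Expressing the result as a log2 ratio is an assumption made here so that a doubled signal corresponds to 1.0 and a halved signal to −1.0; the function name and the example values are hypothetical.

```python
import math
from typing import Dict


def one_versus_rest_log2(signals: Dict[str, float]) -> Dict[str, float]:
    """For each sample, log2 of its signal over the mean of the other samples."""
    result: Dict[str, float] = {}
    for name, value in signals.items():
        others = [v for other, v in signals.items() if other != name]
        result[name] = math.log2(value / (sum(others) / len(others)))
    return result


# Hypothetical signals for one probe position on the four slides.
example = one_versus_rest_log2(
    {"P": 400.0, "C": 100.0, "X1": 100.0, "X2": 100.0}
)
# Samples whose signal is more than double the mean of the others.
more_than_doubled = [name for name, value in example.items() if value > 1.0]
```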

Table 1 shows genes having greater than doubled signal with any one sample as compared to the mean signal of the other three samples.

TABLE 1
Gene                      PvCX12   PvX12   CvX12
WD40 repeat protein A     −1.24    −0.88   −1.07
CDC2                      −1.09    −0.78   −0.92
CYCLIN                    −1.08    −1      −0.26
WD-40 repeat protein B    −1.01    −0.87   −0.42
CDC2                      −0.83    −0.49   −1.01

P = Phloem
C = Cambium
X1 = xylem layer-1
X2 = xylem layer-2
PvCX12 = Ratio of the signal for Phloem target versus mean signal for Cambium, Xylem1, and Xylem2 targets

The data show that the WD40 repeat protein A gene encodes a WD40 repeat protein that is less highly expressed in cambium than in developing xylem, while the WD40 repeat protein B gene encodes a WD40 repeat protein that is more highly expressed in phloem than in the other tissues.

Signal data are then verified with RT-PCR to confirm gene expression in the target tissue of the genes corresponding to the unique oligonucleotides in the probe.

Example 5

Example 5 demonstrates how one can correlate cell cycle gene expression with agronomically important wood phenotypes such as density, stiffness, strength, distance between branches, and spiral grain.

Mature clonally propagated pine trees are selected from among the progeny of known parent trees for superior growth characteristics and resistance to important fungal diseases. The bark is removed from a tangential section and the trees are examined for average wood density in the fifth annual ring at breast height, stiffness and strength of the wood, and spiral grain. The trees are also characterized by their height, mean distance between major branches, crown size, and forking.

To obtain seedling families that are segregating for major genes that affect density, stiffness, strength, distance between branches, spiral grain and other characteristics that may be linked to any of the genes affecting these characteristics, trees lacking common parents are chosen for specific crosses on the criterion that they exhibit the widest variation from each other with respect to the density, stiffness, strength, distance between branches, and spiral grain criteria. Thus, pollen from a plus tree exhibiting high density, low mean distance between major branches, and high spiral grain is used to pollinate cones from the unrelated plus tree among the selections exhibiting the lowest density, highest mean distance between major branches, and lowest spiral grain. It is useful to note that “plus trees” are crossed such that pollen from a plus tree exhibiting high density is used to pollinate developing cones from another plus tree exhibiting high density, for example, and pollen from a tree exhibiting low mean distance between major branches would be used to pollinate developing cones from another plus tree exhibiting low mean distance between major branches.

Seeds are collected from these controlled pollinations and grown such that the parental identity is maintained for each seed and used for vegetative propagation such that each genotype is represented by multiple ramets. Vegetative propagation is accomplished using micropropagation, hedging, or fascicle cuttings. Some ramets of each genotype are stored while vegetative propagules of each genotype are grown to sufficient size for establishment of a field planting. The genotypes are arrayed in a replicated design and grown under field conditions where the daily temperature and rainfall are measured and recorded.

The trees are measured at various ages to determine the expression and segregation of density, stiffness, strength, distance between branches, spiral grain, and any other observable characteristics that may be linked to any of the genes affecting these characteristics. Samples are harvested for characterization of cellulose content, lignin content, cellulose microfibril angle, density, strength, stiffness, tracheid morphology, ring width, and the like. Samples are also examined for gene expression as described in Example 4. Ramets of each genotype are compared to ramets of the same genotype at different ages to establish age:age correlations for these characteristics.

Example 6

Example 6 demonstrates how the stage of plant development and responses to environmental conditions such as light and season can be correlated to cell cycle gene expression using microarrays prepared as in Example 4. In particular, the changes in gene expression associated with wood density are examined.

Trees of three different clonally propagated Eucalyptus grandis hybrid genotypes are grown on a site with a weather station that measures daily temperatures and rainfall. During the spring and subsequent summer, genetically identical ramets of the three different genotypes are first photographed with north-south orientation marks, using photography at sufficient resolution to show bark characteristics of juvenile and mature portions of the plant, and then felled as in Example 4. The age of the trees is determined by planting records and confirmed by a count of the annual rings. In each of these trees, mature wood is defined as the outermost rings of the tree below breast height, and juvenile wood as the innermost rings of the tree above breast height. Each tree is accordingly sectored as follows:

NM—NORTHSIDE MATURE

SM—SOUTHSIDE MATURE

NT—NORTHSIDE TRANSITION

ST—SOUTHSIDE TRANSITION

NJ—NORTHSIDE JUVENILE

SJ—SOUTHSIDE JUVENILE

Tissue is harvested from the plant trunk as well as from juvenile and mature form leaves. Samples are prepared simultaneously for phenotype analysis, including plant morphology and biochemical characteristics, and gene expression analysis. The height and diameter of the tree at the point from which each sector was taken is recorded, and a soil sample from the base of the tree is taken for chemical assay. Samples prepared for gene expression analysis are weighed and placed into liquid nitrogen for subsequent preparation of RNA samples for use in the microarray experiment. The tissues are denoted as follows:

P—phloem

C—cambium

X1—expanding xylem

X2—differentiating and lignifying xylem

Thin slices in tangential and radial sections from each of the sectors of the trunk are fixed as described in Ruzin, Plant Microtechnique and Microscopy, Oxford University Press, Inc., New York, N.Y. (1999) for anatomical examination and confirmation of wood developmental stage. Microfibril angle is examined at the different developmental stages of the wood, for example juvenile, transition and mature phases of Eucalyptus grandis wood. Other characteristics examined are the ratio of fibers to vessel elements and ray tissue in each sector. Additionally, the samples are examined for characteristics that change between juvenile and mature wood and between spring wood and summer wood, such as fiber morphology, lumen size, and width of the S2 (thickest) cell wall layer. Samples are further examined for measurements of density in the fifth ring and determination of modulus of elasticity using techniques well known to those skilled in the art of wood assays. See, e.g., Wang, et al., Non-destructive Evaluations of Trees, EXPERIMENTAL TECHNIQUES, pp. 28-30 (2000).

For biochemical analysis, 50 grams from each of the harvest samples are freeze-dried and analyzed, using biochemical assays well known to those skilled in the art of plant biochemistry for quantities of simple sugars, amino acids, lipids, other extractives, lignin, and cellulose. See, e.g., Pettersen & Schwandt, J. Wood Chem. & Technol. 11:495 (1991).

In the present example, the phenotypes chosen for comparison are high density wood, average density wood, and low density wood. Nucleic acid samples are prepared as described in Example 3, from trees harvested in the spring and summer. Gene expression profiling by hybridization and data analysis is performed as described in Examples 3 and 4.

Using similar techniques and clonally propagated individuals one can examine cell cycle gene expression as it is related to other complex wood characteristics such as strength, stiffness and spirality.

Example 7

Example 7 demonstrates the ability of the oligonucleotide probes of the invention to distinguish between highly homologous members of a family of cell cycle genes. Hybridization to a particular oligonucleotide on the array identifies a unique WD40 gene that is expressed more strongly in a genotype having a higher density wood than is observed in other genotypes examined. The WD40 gene is also expressed more strongly in mature wood than in juvenile wood and more strongly in summer wood than in spring wood. This gene is not found to be expressed at high levels either in leaves or buds.

The gene expression pattern is confirmed by RT-PCR. This gene, the putative “density-related” gene, is used for in situ hybridization of fixed radial sections. The density-related WD40 gene hybridizes most strongly to the vascular cambium in regions of the stem where the xylem is comprised primarily of fibers with few vessel elements and few xylem ray cells.

These results suggest that the WD40 gene product functions in radial cell division, which occurs in the cambium and results in diameter growth, rather than in axial cell division such as may be important in the apex or leaves. Such a gene would be difficult to identify by cDNA microarrays or other traditional hybridization means because the highly conserved regions present in the gene would result in confusing it with genes encoding enzymes having similar catalytic functions, but acting in axial or radial divisions. Furthermore, from the sequence similarity-based annotation suggesting a function of this gene product in cell division and the observation of this microarray hybridization pattern, confirmed by RT-PCR and in situ hybridization, it can be inferred that this gene product functions specifically in developing secondary xylem to guide the cell division patterns of fibers, such that higher expression of this gene results in greater fiber production relative to vessel element or ray production. The fiber content is correlated with a principal components analysis (PCA) variable that accounts for at least 10% of the variation in basic density.

Example 8

Example 8 demonstrates how oligonucleotide probes of the invention can be used to identify one wood “density related” WD40 repeat protein gene and its promoter from among the family of homologous genes. Further, this example demonstrates how a promoter sequence identified using this method is used to transform other hardwood species to result in increased diameter growth rates as compared to wild-type plants of the same species.

The sequence of the WD40 gene is used to probe a Genome Walker library in order to isolate 5′ flanking sequences comprising a promoter region. The promoter region is then operably linked to a beta-glucuronidase reporter gene and cloned into a binary vector for transformation into Eucalyptus using the method described in U.S. Application Ser. No. 60/476,222. Regenerated transgenic tobacco and Eucalyptus plants are then sectioned and stained using X-gluc, demonstrating that the microarray data results in isolation of a promoter capable of highly cambial-specific expression solely in those portions of the stem that develop more fibers than vessel elements or xylem rays.

Using techniques well known to those skilled in the art of molecular biology, the promoter is then operably linked to a cell division promoting gene and this construct placed in a binary vector for transformation into hardwood plants such as Sweetgum and Populus, such that the cell division promoting gene is expressed more strongly than normally in the vascular cambium. This results in increased diameter growth rate in the transgenic hardwood plants relative to control hardwood plants.

Example 9

Example 9 demonstrates how a density related polypeptide can be linked to a tissue-preferred promoter and expressed in pine resulting in a plant with increased wood density.

A density-related polypeptide, which is more highly expressed during the early spring, is identified by the method described in Example 7. A DNA construct having the density-related polypeptide operably linked to a promoter is placed into an appropriate binary vector and transformed into pine using the method of Connett et al. (U.S. patent application Ser. Nos. 09/973,088 and 09/973,089). Pine plants are transformed as described in Connett et al., supra, and the transgenic pine plants are used to establish a forest planting. Increased density even in the spring wood (early wood) is observed in the transgenic pine plants relative to control pine plants which are not transformed with the density related DNA construct.

Example 10

Using techniques well known to those skilled in the art of molecular biology, the sequence of the putative density-related gene isolated in Example 7 is analyzed in genomic DNA isolated from alfalfa. This enables the identification of an orthologue in alfalfa whose sequence is then used to create an RNAi knockout construct. This construct is then transformed into alfalfa. See, e.g., Austin et al., Euphytica 85:381 (1995). The regenerated transgenic plants show lower fiber content and increased ray cell content in the xylem. Such properties improve digestibility, which results in higher growth rates in cattle fed on this alfalfa as compared to wild-type alfalfa of the same species.

Example 11

Example 11 demonstrates how gene expression analysis can be used to find gene variants which are present in mature plants having a desirable phenotype. The presence or absence of such a variant can be used to predict the phenotype of a mature plant, allowing screening of the plants at the seedling stage. Although this example employs eucalyptus, the method used herein is also useful in breeding programs for pine and other tree species.

The sequence of a putative density-related gene is used to probe genomic DNA isolated from Eucalyptus that vary in density as described in previous examples. Non-transgenically produced Eucalyptus hybrids of different wood phenotypes are examined. One hybrid exhibits high wood density and another hybrid exhibits lower wood density. A molecular marker in the 3′ portion of the coding region is found which distinguishes a high-density gene variant from a lower density gene variant.

This molecular marker enables tree breeders to assay non-transgenic Eucalyptus hybrids for likely density profiles while the trees are still at seedling stage, whereas in the absence of the marker, tree breeders must wait until the trees have grown for multiple years before density at harvest age can be reliably predicted. This enables selective outplanting of the best trees at seedling stage rather than an expensive culling operation and resultant erosion at thinning age. This molecular marker is further useful in the breeding program to determine which parents will give rise to high density outcross progeny.

Molecular markers found in the 3′ portion of the coding region of the gene that do not correspond to variants seen more frequently in higher or lower wood density non-transgenic Eucalyptus hybrid trees are also useful. These markers are found to be useful for fingerprinting different genotypes of Eucalyptus, for use in identity-tracking in the breeding program and in plantations.

Example 12

This Example describes microarrays for identifying gene expression differences that contribute to the phenotypic characteristics that are important in commercial wood, namely wood appearance, stiffness, strength, density, fiber dimensions, coarseness, cellulose and lignin content, extractives content and the like.

As in Examples 2-4, woody trees of genera that produce commercially important wood products, in this case Pinus and Eucalyptus, are felled from various sites and at various times of year for the collection and isolation of RNA from developing xylem, cambium, phloem, leaves, buds, roots, and other tissues. RNA is also isolated from seedlings of the same genera.

All contigs are compared to both the ESTs made from RNA isolated from samples containing developing wood and the sequences of the ESTs made from RNA of various tissues that do not contain developing wood. Contigs containing primarily ESTs that show more hybridization in silico to ESTs made from RNA isolated from samples containing developing wood than to ESTs made from RNA isolated from samples not containing developing wood are determined to correspond to possible novel genes particularly expressed in developing wood. These contigs are then used for BLAST searches against public domain sequences. Those contigs that hybridize with high stringency to no known genes or genes annotated as having only a “hypothetical protein” are selected for the next step. These contigs are considered putative novel genes showing wood-preferred expression.

The longest cDNA clones containing sequences hybridizing to the putative novel genes showing wood-preferred expression are selected from cDNA libraries using techniques well known to those skilled in the art of molecular biology. The cDNAs are sequenced and full-length gene-coding sequences together with untranslated flanking sequences are obtained where possible. Stretches of 45-80 nucleotides (or oligonucleotides) are selected from each of the sequences of putative novel genes showing wood-preferred expression such that each oligonucleotide probe hybridizes at high stringency to only one sequence represented in the ESTs made from RNA isolated from trees or seedlings of the same genus.
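The probe selection step can be illustrated with a minimal sketch. It assumes that exact substring matching is an acceptable first-pass proxy for hybridization at high stringency; a practical screen would also need to consider near matches and reverse complements. The function name and the short demonstration sequences are hypothetical.

```python
def unique_windows(target, others, length=45):
    """Return start positions in `target` whose window of `length` bases
    does not occur verbatim in any sequence in `others`."""
    hits = []
    for start in range(len(target) - length + 1):
        window = target[start:start + length]
        # Keep the window only if no other sequence contains it exactly.
        if not any(window in other for other in others):
            hits.append(start)
    return hits

# Toy example with a short window so the result is easy to check by eye.
starts = unique_windows("AAAACCCCGGGG", ["AAAACCCC"], length=6)
```

In this toy case the windows starting at positions 0 to 2 also occur in the second sequence, so only positions 3 to 6 remain as candidate probe sites.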

Oligomers are then chemically synthesized and placed onto a microarray slide as described in Example 3. Each oligomer corresponds to a particular sequence of a putative novel gene showing wood-preferred expression and to no other gene whose sequence is represented among the ESTs made from RNA isolated from trees or seedlings of the same genus.

Sample preparation and hybridization are carried out as in Example 4. The technique used in this example is more effective than use of a microarray using cDNA probes because the presence of a signal represents significant evidence of the expression of a particular gene, rather than of any of a number of genes that may contain similarities to the cDNA due to conserved functional domains or common evolutionary history. Thus, it is possible to differentiate homologous genes, such as those in the same family, but which may have different functions in phenotype determination.

Thus hybridization data, gained using the method of Example 4, enable the user to identify which of the putative novel genes actually has a pattern of coordinate expression with known genes, a pattern of expression consistent with a particular developmental role, and/or a pattern of expression that suggests that the gene has a promoter that drives expression in a valuable way.

The hybridization data thus obtained using this method can be used, for example, to identify a putative novel gene that shows an expression pattern particular to the tracheids with the lowest cellulose microfibril angle in developing spring wood (early wood). The promoter of this gene can also be isolated as in Example 8, and operably linked to a gene that has been shown as in Example 9 to be associated with late wood (summer wood). Transgenic pine plants containing this construct are generated using the methods of Example 9, and the early wood of these plants is then shown to display several characteristics of late wood, such as higher microfibril angle, higher density, smaller average lumen size, etc.

Example 13

Example 13 demonstrates the use of a cambium-specific promoter functionally linked to a cell cycle gene for increased plant biomass.

Cambium-specific cell cycle transcripts are identified via array analyses of different secondary vasculature layers as described in Example 4. Candidate promoters linked to the genes corresponding to these transcripts are cloned from pine genomic DNA using, e.g., the BD Clontech GenomeWalker kit and tested in transgenic tobacco via one or more reporter assays for cambium specificity/preference. The cambium-specific promoter overexpressing a cell cycle gene involved in secondary xylem cell division is used to increase wood biomass. A tandem cambium-specific promoter is constructed driving the cell cycle ORF. Boosted transcript levels of the candidate cell cycle gene result in an increased xylem biomass phenotype.

Example 14

Isolation and Characterization of cDNA Clones from Eucalyptus grandis

Eucalyptus grandis cDNA expression libraries were prepared from mature shoot buds, early wood phloem, floral tissue, leaf tissue (two independent libraries), feeder roots, structural roots, xylem or early wood xylem and were constructed and screened as follows.

Total RNA was extracted from the plant tissue using the protocol of Chang et al. (Plant Molecular Biology Reporter 11:113-116 (1993)). mRNA was isolated from the total RNA preparation using either a Poly(A) Quik mRNA Isolation Kit (Stratagene, La Jolla, Calif.) or Dynal Beads Oligo (dT)25 (Dynal, Skogen, Norway). A cDNA expression library was constructed from the purified mRNA by reverse transcriptase synthesis followed by insertion of the resulting cDNA clones in Lambda ZAP using a ZAP Express cDNA Synthesis Kit (Stratagene), according to the manufacturer's protocol. The resulting cDNAs were packaged using a Gigapack II Packaging Extract (Stratagene) using an aliquot (1-5 μl) from the 5 μl ligation reaction dependent upon the library. Mass excision of the library was done using XL1-Blue MRF′ cells and XLOLR cells (Stratagene) with ExAssist helper phage (Stratagene). The excised phagemids were diluted with NZY broth (Gibco BRL, Gaithersburg, Md.) and plated out onto LB-kanamycin agar plates containing X-gal and isopropylthio-beta-galactoside (IPTG).

Of the colonies plated and selected for DNA miniprep, 99% contained an insert suitable for sequencing. Positive colonies were cultured in NZY broth with kanamycin and cDNA was purified by means of alkaline lysis and polyethylene glycol (PEG) precipitation. Agarose gel at 1% was used to screen sequencing templates for chromosomal contamination. Dye primer sequences were prepared using a Turbo Catalyst 800 machine (Perkin Elmer/Applied Biosystems Division, Foster City, Calif.) according to the manufacturer's protocol.

DNA sequence for positive clones was obtained using a Perkin Elmer/Applied Biosystems Division Prism 377 sequencer. cDNA clones were sequenced first from the 5′ end and, in some cases, also from the 3′ end. For some clones, internal sequence was obtained using either Exonuclease III deletion analysis, yielding a library of differentially sized subclones in pBK-CMV, or by direct sequencing using gene-specific primers designed to identified regions of the gene of interest.

The determined cDNA sequences were compared with known sequences in the EMBL database using the computer algorithms FASTA and/or BLASTN. Multiple alignments of redundant sequences were used to build reliable consensus sequences. Based on similarity to known sequences from other plant species, the isolated polynucleotide sequences were identified as encoding transcription factors, as detailed herein. The predicted polypeptide sequences corresponding to the polynucleotide sequences are also depicted therein.

Example 15

Isolation and Characterization of cDNA Clones from Pinus radiata

Pinus radiata cDNA expression libraries (prepared from either shoot bud tissue, suspension cultured cells, early wood phloem (two independent libraries), fascicle meristem tissue, male strobilus, root (unknown lineage), feeder roots, structural roots, female strobilus, cone primordia, female receptive cones and xylem (two independent libraries)) were constructed and screened as described above in Example 14.

DNA sequence for positive clones was obtained using forward and reverse primers on a Perkin Elmer/Applied Biosystems Division Prism 377 sequencer and the determined sequences were compared to known sequences in the database as described above.

Based on similarity to known sequences from other plant species, the isolated polynucleotide sequences were identified as encoding transcription factors, as detailed herein. The predicted polypeptide sequences corresponding to the polynucleotide sequences are also depicted therein.

Example 16

5′ RACE Isolation

To identify additional sequence 5′ or 3′ of a partial cDNA sequence in a cDNA library, 5′ and 3′ rapid amplification of cDNA ends (RACE) was performed using the SMART RACE cDNA amplification kit (Clontech Laboratories, Palo Alto, Calif.). Generally, the method entailed first isolating poly(A) mRNA, performing first and second strand cDNA synthesis to generate double stranded cDNA, blunting cDNA ends, and then ligating the SMART RACE adaptor to the cDNA to form a library of adaptor-ligated ds cDNA. Gene-specific primers were designed to be used along with adaptor specific primers for both 5′ and 3′ RACE reactions. Using 5′ and 3′ RACE reactions, 5′ and 3′ RACE fragments were obtained, sequenced, and cloned. The process may be repeated until the 5′ and 3′ ends of the full-length gene are identified. A full-length cDNA may be generated by end-to-end PCR using primers specific to the 5′ and 3′ ends of the gene.

For example, to amplify the missing 5′ region of a gene from first-strand cDNA, a primer was designed 5′→3′ from the opposite strand of the template sequence, and from the region between ˜100-200 bp of the template sequence. A successful amplification should give an overlap of ˜100 bp of DNA sequence between the 5′ end of the template and PCR product.

RNA was extracted from four pine tissues, namely seedling, xylem, phloem and structural root, using the Concert Reagent Protocol (Invitrogen, Carlsbad, Calif.) and standard isolation and extraction procedures. The resulting RNA was then treated with DNase, using 10 U/μl DNase I (Roche Diagnostics, Basel, Switzerland). For 100 μg of RNA, 9 μl 10× DNase buffer (Invitrogen, Carlsbad, Calif.), 10 μl of Roche DNase I and 90 μl of RNase-free water were used. The RNA was then incubated at room temperature for 15 minutes and 1/10 volume 25 mM EDTA was added. An RNeasy mini kit (Qiagen, Venlo, The Netherlands) was used for RNA clean up according to the manufacturer's protocol.

To synthesize cDNA, the extracted RNA from xylem, phloem, seedling and root was used with the SMART RACE cDNA amplification kit (Clontech Laboratories Inc, Palo Alto, Calif.) according to the manufacturer's protocol. For the RACE PCR, the cDNA from the four tissue types was combined. The master mix for PCR was created by combining equal volumes of cDNA from xylem, phloem, root and seedling tissues. PCR reactions were performed in 96 well PCR plates, with 1 μl of primer from the primer dilution plate (10 mM) added to the corresponding well positions. 49 μl of master mix was aliquoted into the PCR plate with primers. Thermal cycling commenced on a GeneAmp 9700 (Applied Biosystems, Foster City, Calif.) at the following parameters:

94° C. (5 sec),

72° C. (3 min), 5 cycles;

94° C. (5 sec),

70° C. (10 sec),

72° C. (3 min), 5 cycles;

94° C. (5 sec),

68° C. (10 sec),

72° C. (3 min), 25 cycles.
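The cycling parameters listed above can be written out as a small data structure to check the total number of cycles and the programmed hold time. This is only an illustrative sketch: the names are hypothetical, and ramp times between temperatures are ignored.

```python
# Each stage: (list of (temperature in deg C, hold time in seconds), cycles)
PROGRAM = [
    ([(94, 5), (72, 180)], 5),
    ([(94, 5), (70, 10), (72, 180)], 5),
    ([(94, 5), (68, 10), (72, 180)], 25),
]

def total_cycles(program):
    """Total number of cycles across all stages."""
    return sum(cycles for _, cycles in program)

def hold_time_seconds(program):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return sum(sum(hold for _, hold in steps) * cycles
               for steps, cycles in program)

cycles = total_cycles(PROGRAM)        # 35
seconds = hold_time_seconds(PROGRAM)  # 6775
```

The program therefore comprises 35 cycles with about 113 minutes of programmed hold time, with the annealing temperature stepping down from 72° C. to 70° C. to 68° C. across the three stages.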

cDNA was separated on an agarose gel following standard procedures. Gel fragments were excised and eluted from the gel by using the Qiagen 96-well Gel Elution kit, following the manufacturer's instructions.

PCR products were ligated into pGEM-T Easy (Promega, Madison, Wis.) in a 96 well plate according to the following specifications: 60-80 ng of DNA, 5 μl 2× rapid ligation buffer, 0.5 μl pGEM-T Easy vector, 0.1 μl DNA ligase, filled to 10 μl with water, and incubated overnight.

Each clone was transformed into E. coli following standard procedures and DNA was extracted from 12 clones picked by following standard protocols. DNA extraction and DNA quality were verified on a 1% agarose gel. The presence of the correct size insert in each of the clones was determined by restriction digests, using the restriction endonuclease EcoRI, and gel electrophoresis, following standard laboratory procedures.

Example 17

Curation of an EST Sequence.

During the production of cDNA libraries, the original transcripts or their DNA counterparts may have features that prevent them from coding for functional proteins. There may be insertions, deletions, base substitutions, or unspliced or improperly spliced introns. If such features exist, it is often possible to identify them so that they can be changed. Similar curation can be performed on any other sequences that have homology to sequences in the public databases.

After determination of the DNA sequence, BLAST analysis shows that it is related to an Arabidopsis gene in the publicly available Arabidopsis genome sequence. However, instead of coding for an approximately 240 amino acid polypeptide, the consensus being curated is predicted to code for a product of only 157 amino acid residues, suggesting an error in the DNA sequence. To identify where the genuine coding region might be, the DNA sequence to the end of each EST is translated in each of the three reading frames and the predicted sequences are aligned with the Arabidopsis gene's amino acid sequence. It is found that the DNA segment in one portion of the EST codes for a sequence with similarity to the carboxyl terminus of the Arabidopsis gene. Therefore, it appears that an unspliced intron is present in the EST.
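Translation in the three forward reading frames can be sketched as follows, using the standard genetic code. The function name and the short demonstration sequence are hypothetical; stop codons are shown as "*" and codons containing ambiguous bases as "X".

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_frames(dna):
    """Translate a DNA string in forward reading frames 0, 1 and 2."""
    dna = dna.upper()
    frames = []
    for offset in range(3):
        codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
        frames.append("".join(CODON_TABLE.get(codon, "X") for codon in codons))
    return frames

frames = translate_frames("ATGGCCTAA")  # ['MA*', 'WP', 'GL']
```

Each of the three predicted sequences can then be aligned with the reference amino acid sequence to see which frame carries the similarity in each portion of the EST.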

Unspliced introns are a relatively minor issue with regard to use of a cloned sequence for overexpression of the gene of interest. The RNA resulting from transcription of the cDNA can be expected to undergo normal processing to remove the intron. Antisense and RNAi constructs are also expected to function to suppress the gene of interest. On other occasions, it may be desirable to identify the precise limits of the intron so that it can be removed. When the sequence in question has a published sequence that is highly similar, it may be possible to find the intron by aligning the two sequences and identifying the locations where the sequence identity falls off, aided by the knowledge that introns start with the sequence GT and end with the sequence AG.
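The GT...AG rule mentioned above can be used to list candidate intron boundaries within a region where sequence identity falls off. The sketch below only enumerates spans that begin with GT and end with AG; it does not perform the alignment itself, and the function names, minimum length, and demonstration sequence are hypothetical.

```python
def candidate_introns(seq, min_len=4):
    """Return (start, end) pairs such that seq[start:end] begins with GT
    and ends with AG and is at least `min_len` bases long."""
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [(s, e) for s in donors for e in acceptors if e - s >= min_len]

def splice(seq, start, end):
    """Remove the candidate intron seq[start:end]."""
    return seq[:start] + seq[end:]

demo = "ATGGCC" + "GTAAAG" + "TTTTAA"   # exon, toy intron, exon
pairs = candidate_introns(demo)         # [(6, 12)]
spliced = splice(demo, *pairs[0])       # 'ATGGCCTTTTAA'
```

In practice many GT...AG pairs will be found, so each candidate would be judged by whether its removal restores the reading frame and the similarity to the published sequence.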

When there is some doubt about the site of the intron because highly similar sequences are not available, the intron location can be verified experimentally. For example, DNA oligomers can be synthesized flanking the region where the suspected intron is located. RNA from the source species, either Pinus or Eucalyptus, is isolated and used as a template to make cDNA using reverse transcriptase. The selected primers are then used in a PCR reaction to amplify the correctly spliced DNA segment (predicted size of approximately 350 bp smaller than the corresponding segment of the original consensus) from the population of cDNAs. The amplified segment is then subjected to sequence analysis and compared to the consensus sequence to identify the differences.

The same procedure can be used when an alternate splicing event (partial intron remaining, or partial loss of an exon) is suspected. When an EST has a small change, such as insertion or deletion of a small number of bases, computer analysis of the EST sequence can still indicate its location when a translation product of the wrong size is predicted or if there is an obvious frameshift. Verification of the true sequence is done by synthesis of primers, production of new cDNA, and PCR amplification as described above.

Example 18

Transformation of Populus deltoides with constructs containing cell cycle genes.

Constructs made as described in the preceding example and shown in Table 2 below were each inoculated into Agrobacterium cultures by standard techniques.

Table 2 identifies plasmid(s), genes, and Genesis ID numbers for constructions described in Example 17.

TABLE 2
Plasmid(s)   Gene                 Genesis ID
pGrw14       Cyclin A             prga001823
pGrw15       Cyclin A             prpe001264
pGrw16       Cyclin D             prxa004540
pGrw18       Cyclin D             prxl006271
pGrw19       Cyclin D             prpb019661
pGrw20       WEE1-like protein    prrd041233

Populus deltoides stock plant cultures were maintained on DKW medium (Driver and Kuniyuki, 1984, McGranahan et al. 1987, available commercially from Sigma/Aldrich) with 2.5 μM zeatin in a growth room with a 16 h photoperiod. For transformation, petioles were excised aseptically using a sharp scalpel blade from the stock plants, cut into 4-6 mm lengths, placed on DKW medium with 1 μg/ml BAP and 1 μg/ml NAA immediately after harvest, and incubated in a dark growth chamber (28° C.) for 24 hours.

Agrobacterium cultures containing the desired constructs were grown to log phase, indicated by an OD600 between 0.8 and 1.0 A, then pelleted and resuspended in an equal volume of Agrobacterium Induction Medium (AIM), which contains Woody Plant Medium salts (Lloyd, G., and McCown, B., 1981. Woody plant medium. Proc. Intern. Plant Prop. Soc. 30:421, available commercially from Sigma/Aldrich), 5 g/L glucose and 0.6 g/L MES at pH 5.8, with the addition of 1 ul of a 100 mM stock solution of acetosyringone per ml of AIM. The pellet was resuspended by vortexing. The bacterial cells were incubated for an hour in this medium at 28 degrees C. in an environmental chamber, shaking at 100 rpm.
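The acetosyringone addition stated above (1 ul of a 100 mM stock per ml of AIM) corresponds to a final concentration of roughly 100 uM. The short Python sketch below works through that dilution; the function name is hypothetical and the calculation assumes simple volumetric mixing.

```python
def final_concentration_um(stock_mm, stock_volume_ul, medium_volume_ml):
    """Return the final concentration in micromolar after adding stock to medium."""
    stock_um = stock_mm * 1000.0              # mM -> uM
    medium_ul = medium_volume_ml * 1000.0     # ml -> ul
    total_ul = medium_ul + stock_volume_ul    # the added stock slightly increases the volume
    return stock_um * stock_volume_ul / total_ul


acetosyringone_um = final_concentration_um(
    stock_mm=100.0, stock_volume_ul=1.0, medium_volume_ml=1.0
)
print(round(acetosyringone_um, 1))  # 99.9
```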

After the induction period, Populus deltoides explants were exposed to the Agrobacterium mixture for 15 minutes. The explants were then lightly blotted on sterile paper towels, replaced onto the same plant medium and cultured in the dark at 18-20 degrees C. After a three-day co-cultivation period, the explants were transferred to DKW medium in which the NAA concentration was reduced to 0.1 ug/ml and to which was added 400 mg/L timentin to eradicate the Agrobacterium.

After 4 days on eradication medium, explants were transferred to small magenta boxes containing the same medium supplemented with timentin (400 mg/L) as well as the selection agent geneticin (50 mg/L). Explants were transferred every two weeks to fresh selection medium. Calli that grew in the presence of selection were isolated and sub-cultured to fresh selection medium every three weeks. Calli were observed for the production of adventitious shoots.

Adventitious shoots were normally observed within two months from the initiation of transformation. These shoot clusters were transferred to DKW medium to which no NAA was added, and in which the BAP concentration was reduced to 0.5 ug/ml, for shoot elongation, typically for about 14 weeks. Elongated shoots were excised and transferred to BTM medium (Chalupa, Communicationes Instituti Forestalis Čechosloveniae 13:7-39, 1983, available commercially from Sigma/Aldrich) at pH 5.8, containing 20 g/l sucrose and 5 g/l activated charcoal. See Table 3 below.

TABLE 3
Rooting medium for Populus deltoides.
BTM-1 Media Components mg/L
NH4NO3 412
KNO3 475
Ca(NO3)2•4H2O 640
CaCl2•2H2O  440*
MgSO4•7H2O 370
KH2PO4 170
MnSO4•H2O    2.3
ZnSO4•7H2O    8.6
CuSO4•5H2O    0.25
CoCl2•6H2O    0.02
KI    0.15
H3BO3    6.2
Na2MoO4•2H2O    0.25
FeSO4•7H2O   27.8
Na2EDTA•2H2O   37.3
Myo-inositol 100
Nicotinic acid    0.5
Pyridoxine HCl    0.5
Thiamine HCl  1
Glycine  2
Sucrose 20000 
Activated Carbon 5000 

After development of roots, typically four weeks, transgenic plants were propagated in the greenhouse by rooted cutting methods, or in vitro through axillary shoot induction for four weeks on DKW medium containing 11.4 uM zeatin, after which the multiplied shoots were separated and transferred to root induction medium. Rooted plants were transferred to soil for evaluation of growth in glasshouse and field conditions.

Example 19

Production of disproportionately large leaves mediated by ectopic expression of certain cyclin D genes

Approximately 100 explants of Populus deltoides per construct were transformed with pGRW16 and pGRW19, which contain genes that normally show preferred expression in the vasculature, driven by a constitutive promoter (the Pinus radiata superubiquitin promoter). Upon regeneration, many of the ramets of many of the translines were observed to have disproportionately large leaves relative to control plants. The leaves were both longer and broader than those of control plants.

Disproportionately large leaves could be a very useful early indicator of high growth potential. Large leaf size can be a function of either increased numbers of leaf cells or increased leaf cell size or both.

Example 20

Production of unusual vascular development mediated by ectopic expression of a cyclin D gene.

Approximately 100 explants of Populus deltoides per construct were transformed with pGRW18. Multiple transgenic lines regenerated from this experiment showed a unique pleiotropic phenotype. Leaves of these transgenic lines were symmetrically folded on both sides of the midrib down the entire length of the leaf. Many petioles of these lines spiraled, and in many cases turned 360 degrees, in a right-handed fashion towards the leaf. The stem showed some thickening and slight bending near the middle.

One ramet of the transgenic line TDL002534 showing these phenotypes was sacrificed to investigate these aberrancies at the tissue level. Transverse sections of a curling petiole stained with toluidine blue revealed retardation of vascular development, but the presence of additional vascular cylinders developing as indicated by the black arrows. The xylem and phloem within the vascular cylinders of the curling petiole appeared to be developmentally similar and spatially oriented correctly. Longitudinal sections of straight and curled petioles may offer an explanation for the spiraling phenomenon. Curled petioles showed more elongated cells on the outside turn of the curl and more compressed cells on the opposite side of the petiole.

Perhaps the most striking phenotype was identified in the leaves. As with the petioles, aberrant vascular development was noted, comprising additional forming vascular cylinders lateral to the larger midrib. In some sections almost fully-formed veins could be seen immediately adjacent to the midrib. In all instances where the folding phenotype was noted, this type of leaf configuration was associated with the phenotype.

The development of additional vascular cylinders in the space where normally a small number of vascular bundles or a single midrib are seen is indicative of unusual cell division activity at the level of early vascular development. Thus, this gene expressed under the control of a vascular-preferred promoter rather than a constitutive promoter could have utility in increasing cell division in later vascular development, creating additional wood.

Example 21

This example illustrates how polynucleotides important for wood development in P. radiata can be determined and how oligonucleotides which uniquely bind to those genes can be designed and synthesized for use on a microarray.

Open pollinated trees of approximately 16 years of age are selected from plantation-grown sites, in the United States for loblolly pine, and in New Zealand for radiata pine. Trees are felled during the spring and summer seasons to compare the expression of genes associated with these different developmental stages of wood formation. Trees are felled individually and trunk sections are removed from the bottom area approximately one to two meters from the base and within one to two meters below the live crown. The section removed from the basal end of the trunk contains mature wood. The section removed from below the live crown contains juvenile wood. Samples collected during the spring season are termed earlywood or springwood, while samples collected during the summer season are considered latewood or summerwood (Larson et al., Gen. Tech. Rep. FPL-GTR-129. Madison, Wis.: U.S. Department of Agriculture, Forest Service, Forest Products Laboratory. p. 42).

Tissues are isolated from the trunk sections such that phloem, cambium, developing xylem, and maturing xylem are removed. These tissues are collected only from the current year's growth ring. Upon tissue removal in each case, the material is immediately plunged into liquid nitrogen to preserve the nucleic acids and other components. The bark is peeled from the section and phloem tissue removed from the inner face of the bark by scraping with a razor blade. Cambium tissue is isolated from the outer face of the peeled section by gentle scraping of the surface. Developing xylem and lignifying xylem are isolated by sequentially performing more vigorous scraping of the remaining tissue. Tissues are transferred from liquid nitrogen into containers for long term storage at −70° C. until RNA extraction and subsequent analysis is performed.

Example 22

This example illustrates a procedure for RNA extraction and purification, which is particularly useful for RNA obtained from conifer needle, xylem, cambium, and phloem.

Tissue is obtained from conifer needle, xylem, cambium or phloem. The tissue is frozen in liquid nitrogen and ground. The total RNA is extracted using Concert Plant RNA reagent (Invitrogen). The resulting RNA sample is extracted into phenol:chloroform and treated with DNase. The RNA is then incubated at 65° C. for 2 minutes followed by centrifugation at 4° C. for 30 minutes. Following centrifugation, the RNA is extracted into phenol at least 10 times to remove contaminants.

The RNA is further cleaned using RNeasy columns (Qiagen). The purified RNA is quantified using RiboGreen reagent (Molecular Probes) and purity assessed by gel electrophoresis.

RNA is then amplified using MessageAmp (Ambion). Aminoallyl-UTP and free UTP are added to the in vitro transcription of the purified RNA at a ratio of 4:1 aminoallyl-UTP-to-UTP. The aminoallyl-UTP is incorporated into the new RNA strand as it is transcribed. The aminoallyl group is then reacted with Cy dyes to attach the fluorescent label to the resulting amplified RNA using the Amersham procedure modified for use with RNA. Unincorporated dye is removed by ethanol precipitation. The labeled RNA is quantified spectrophotometrically (NanoDrop). The labeled RNA is fragmented by heating to 95° C. as described in Hughes et al., Nature Biotechnol. 19:342 (2001).

Example 23

This Example illustrates how genes important for wood development in P. radiata can be determined and how oligonucleotides which uniquely bind to those genes can be designed and synthesized for use on a microarray.

Pine trees of the species P. radiata are grown under natural light conditions. Tissue samples are prepared as described in, e.g., Sterky et al., Proc. Nat'l Acad. Sci. 95:13330 (1998). Specifically, tissue samples are collected from woody trees having a height of 5 meters. Tissue samples of the woody trees are prepared by taking tangential sections through the cambial region of the stem. The stems are sectioned horizontally into sections ranging from juvenile (top) to mature (bottom). The stem sections separated by stage of development are further separated into 5 layers by peeling into sections of phloem, differentiating phloem, cambium, differentiating xylem, developing xylem, and mature xylem. Tissue samples, including leaves, buds, shoots, and roots are also prepared from seedlings of the species P. radiata.

RNA is isolated and ESTs generated as described in the Example above or Sterky et al., supra. The nucleic acid sequences of ESTs derived from samples containing developing wood are compared with nucleic acid sequences of genes known to be involved in polysaccharide synthesis. ESTs from samples that do not contain developing wood are also compared with sequences of genes known to be involved in the plant cell cycle. An in silico hybridization analysis is performed using BLAST (NCBI) as follows.

Example 24

Eucalyptus in Silico Data

In silico gene expression can be used to determine the membership of the consensi EST libraries. For each library, a consensus value is determined as the number of ESTs in a given tissue class divided by the total number of ESTs in that class, multiplied by 1000. These values provide a normalized value that is not biased by the extent of sequencing from a library. Several libraries were sampled for a consensus value, including reproductive, bud reproductive, bud vegetative, fruit, leaf, phloem, cambium, xylem, root, stem, sap vegetative, and whole plant libraries.

As shown below, a number of the inventive sequences exhibit vascular-preferred expression (more than 50% of the hits by these sequences if the databases were searched at random would be in libraries made from developing vascular tissue) and thus are likely to be involved in wood-related developmental processes. The data are shown in Table 12.
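The normalization and the vascular-preferred criterion can be illustrated with a short Python sketch. It follows one reading of the description above (ESTs per 1000 ESTs sequenced in each tissue class, with more than 50% of the normalized hits falling in vascular libraries). The set of classes treated as developing vascular tissue, the function names, and all counts are assumptions made for illustration and are not data from the application.

```python
# Assumption: these tissue classes are treated as developing vascular tissue.
VASCULAR_CLASSES = {"phloem", "cambium", "xylem"}


def normalized_expression(est_counts, library_sizes):
    """ESTs of one consensus per 1000 ESTs sequenced in each tissue class."""
    return {
        tissue: 1000.0 * est_counts.get(tissue, 0) / total
        for tissue, total in library_sizes.items()
    }


def vascular_fraction(normalized):
    """Share of the normalized hits that fall in vascular tissue classes."""
    total = sum(normalized.values())
    if total == 0:
        return 0.0
    vascular = sum(v for t, v in normalized.items() if t in VASCULAR_CLASSES)
    return vascular / total


# Hypothetical library sizes and EST counts for a single consensus sequence.
library_sizes = {"leaf": 5000, "root": 4000, "xylem": 10000, "cambium": 2000}
est_counts = {"leaf": 1, "xylem": 8, "cambium": 2}

norm = normalized_expression(est_counts, library_sizes)
fraction = vascular_fraction(norm)
is_vascular_preferred = fraction > 0.5
print(is_vascular_preferred)  # True
```

Normalizing by library size before computing the fraction prevents a deeply sequenced library from dominating the result.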

Example 25

Pinus in Silico Data

In silico gene expression can be used to determine the membership of the consensi EST libraries. For each library, a consensus value is determined as the number of ESTs in a given tissue class divided by the total number of ESTs in that class, multiplied by 1000. These values provide a normalized value that is not biased by the extent of sequencing from a library. Several libraries were sampled for a consensus value, including needles, phloem, cambium, xylem, root, stem, and whole plant libraries.

As shown below, a number of the inventive sequences exhibit vascular-preferred expression (more than 50% of the hits by these sequences if the databases were searched at random would be in libraries made from developing vascular tissue) and thus are likely to be involved in wood-related developmental processes. The data are shown in Table 13.

Example 26

Sequences that show hybridization in silico to ESTs made from samples containing developing wood, but that do not hybridize to ESTs from samples not containing developing wood are selected for further examination.

cDNA clones containing sequences that hybridize to the genes showing wood-preferred expression are selected from cDNA libraries using techniques well known in the art of molecular biology. Using the sequence information, oligonucleotides are designed such that each oligonucleotide is specific for only one cDNA sequence in the library. The oligonucleotide sequences are provided in Table 14. 60-mer oligonucleotide probes are designed using the method of Li and Stormo, supra or using software such as ArrayDesigner, GeneScan, and ProbeSelect.
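The requirement that each oligonucleotide be specific for only one cDNA sequence can be illustrated with a simple exact-match filter in Python. This is a deliberately simplified sketch and is not the method of Li and Stormo or of the named software packages, which also weigh factors such as melting temperature and near-match cross-hybridization. The function name and the toy sequences are hypothetical.

```python
from collections import defaultdict


def unique_probes(sequences, probe_length=60):
    """Map each sequence ID to the probe-length substrings found in no other sequence."""
    owners = defaultdict(set)
    for seq_id, seq in sequences.items():
        seq = seq.upper()
        for start in range(len(seq) - probe_length + 1):
            owners[seq[start:start + probe_length]].add(seq_id)

    result = {seq_id: [] for seq_id in sequences}
    for probe, ids in owners.items():
        if len(ids) == 1:  # substring occurs in exactly one sequence
            result[next(iter(ids))].append(probe)
    return result


# Toy example with 4-mers so the result is easy to check by eye.
toy = {"cdna_a": "ACGTACGG", "cdna_b": "TTACGTAA"}
probes = unique_probes(toy, probe_length=4)
print(sorted(probes["cdna_a"]))  # ['ACGG', 'GTAC']
print(sorted(probes["cdna_b"]))  # ['GTAA', 'TTAC']
```

With the default probe_length of 60, the same function returns candidate 60-mers of the kind described above.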

The oligonucleotides are then synthesized in situ as described in Hughes et al., Nature Biotechnol. 19:342 (2001) or as described in Kane et al., Nucleic Acids Res. 28:4552 (2000) and affixed to an activated glass slide (Sigma-Genosys, The Woodlands, Tex.) using a 5′ amino linker. The position of each oligonucleotide on the slide is known.

Example 27

This example illustrates how to detect expression of Pinus radiata genes of the instant application which are important in wood formation using an oligonucleotide microarray prepared as described above. This is an example of a balanced incomplete block designed experiment carried out using aRNA samples prepared from mature-phase phloem (P), cambium (C), expanding xylem found in a layer below the cambium (X1) and differentiating, lignifying xylem cells found deeper in the same growth ring (X2). In this example, cell cycle gene expression is compared among the four samples, namely P, C, X1, and X2.

In the summer, plants of the species Pinus radiata are felled and the bark of the main stem is immediately pulled gently away to reveal the phloem and xylem. The phloem and xylem are then peeled with a scalpel into separate containers of liquid nitrogen. Needles (leaves) and buds from the trees are also harvested with a scalpel into separate containers of liquid nitrogen. RNA is subsequently isolated from the frozen tissue samples as described in Example 1. Equal microgram quantities of total RNA are purified from each sample using RNeasy Mini columns (Qiagen, Valencia, Calif.) according to the manufacturer's instructions.

Amplification reactions are carried out for each of the P, C, X1, and X2 tissue samples. Amplification reactions are performed using Ambion's MessageAmp kit, a T7-based amplification procedure, following the manufacturer's instructions, except that labeled aaUTP is added to the reagent mix during the amplification step. aaUTP is incorporated into the resulting antisense RNA formed during this step. CyDye fluorescent labels are coupled to the aaUTPs in a non-enzymatic reaction as described in Example 1. Labeled amplified antisense RNAs are precipitated and washed, and then assayed for purity using a NanoDrop spectrophotometer. These labeled antisense RNAs, corresponding to the RNA isolated from the P, C, X1, and X2 tissue samples, constitute the sample nucleic acids, which are referred to as the P, C, X1, and X2 samples.

Normalization control samples of known nucleic acids are added to each sample in a dilution series of 500, 200, 100, 50, 25 and 10 pg/μl for quantitation of the signals. Positive controls corresponding to specific genes showing expression in all tissues of pine, such as housekeeping genes, are also added to the plant sample.

Each of four microarray slides is incubated with 125 μL of a P, C, X1 or X2 sample under a coverslip at 42° C. for 16-18 hours. The arrays are washed in 1×SSC, 0.1% SDS for 10 minutes and then in 0.1×SSC, 0.1% SDS for 10 minutes and then allowed to dry.

The array slides are scanned using an Axon laser scanner and analyzed using GenePix Pro software. Data from the microarray slides are subjected to microarray data analysis using GenStat, SAS, or Spotfire software. Outliers are removed and ratiometric data for each of the datasets are normalized using a global normalization which employs a cubic spline fit applied to correct for differential dye bias and spatial effects. A second transformation is performed to fit control signal ratios to a mean log2 ratio of 0 (i.e., a 1:1 ratio). Normalized data are then subjected to a variance analysis.
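The second transformation, in which control ratios are fitted to a mean log2 of 0, can be illustrated by a simple centering step in Python. This sketch assumes that the fit amounts to subtracting the mean log2 ratio of the control spots, and it omits the cubic spline step entirely. The function name, spot names, and values are hypothetical.

```python
import math


def center_on_controls(ratios, control_ids):
    """Shift log2 ratios so the control spots average to log2 = 0 (a 1:1 ratio)."""
    log_ratios = {spot: math.log2(r) for spot, r in ratios.items()}
    offset = sum(log_ratios[c] for c in control_ids) / len(control_ids)
    return {spot: value - offset for spot, value in log_ratios.items()}


# Hypothetical raw signal ratios for two control spots and one gene spot.
ratios = {"ctrl_1": 2.0, "ctrl_2": 8.0, "gene_x": 16.0}
centered = center_on_controls(ratios, ["ctrl_1", "ctrl_2"])
print(centered["gene_x"])  # 2.0
```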

Mean signal intensity for each signal at any given position on the microarray slide is determined for each set of three of the P, C, X1, and X2 sample microarray slides. This mean signal at each probe position is compared to the signal at the same position on the sample slide that was not used for calculating the mean. For example, a mean signal at a given position is determined for P, C, and X1, and the signal at that position in the X2 microarray slide is compared to the P, C, and X1 mean signal value.
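The leave-one-out comparison described above can be sketched in Python as follows. Expressing the comparison as a log2 ratio is an assumption made for illustration, since the text states only that the held-out signal is compared to the mean of the other three. The function name and signal values are hypothetical.

```python
import math


def leave_one_out_log2(signals):
    """For each sample, log2 of its signal over the mean signal of the other samples."""
    result = {}
    for name, value in signals.items():
        others = [v for other, v in signals.items() if other != name]
        result[name] = math.log2(value / (sum(others) / len(others)))
    return result


# Hypothetical signals for one probe position on the P, C, X1, and X2 slides.
probe_signals = {"P": 400.0, "C": 100.0, "X1": 100.0, "X2": 100.0}
scores = leave_one_out_log2(probe_signals)
print(scores["P"])  # 2.0
print(scores["C"])  # -1.0
```

Under this convention a value of 1 corresponds to a doubled signal and a value of −1 to a halved signal relative to the mean of the other three samples.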

Table 5 shows genes having greater than doubled signal with any one sample as compared to the mean signal of the other three samples.

TABLE 5
Gene PvCX12 PvX12 CvX12
WD40 repeat protein A −1.24 −0.88 −1.07
CDC2 −1.09 −0.78 −0.92
CYCLIN −1.08 −1 −0.26
WD-40 repeat protein B −1.01 −0.87 −0.42
CDC2 −0.83 −0.49 −1.01
P = Phloem
C = Cambium
X1 = xylem layer-1
X2 = xylem layer-2
PvCX12 = Ratio of the signal for Phloem target versus mean signal for Cambium, Xylem1, and Xylem2 targets

The data show that WD40 repeat protein A encodes a WD40 repeat protein that is less highly expressed in cambium than in developing xylem, while WD40 repeat protein B encodes a WD40 repeat protein that is more highly expressed in phloem than in the other tissues.

Signal data are then verified with RT-PCR to confirm gene expression in the target tissue of the genes corresponding to the unique oligonucleotides in the probe.

Example 28

This example illustrates how RNAs of tissues from multiple pine species, in this case both P. radiata and loblolly pine P. taeda trees, are selected for analysis of the pattern of gene expression associated with wood development in the juvenile wood and mature wood forming sections of the trees using the microarrays derived from P. radiata cDNA sequences described in Example 4.

Open pollinated trees of approximately 16 years of age are selected from plantation-grown sites, in the United States for loblolly pine, and in New Zealand for radiata pine. Trees are felled during the spring and summer seasons to compare the expression of genes associated with these different developmental stages of wood formation. Trees are felled individually and trunk sections are removed from the bottom area approximately one to two meters from the base and within one to two meters below the live crown. The section removed from the basal end of the trunk contains mature wood. The section removed from below the live crown contains juvenile wood. Samples collected during the spring season are termed earlywood or springwood, while samples collected during the summer season are considered latewood or summerwood. Larson et al., Gen. Tech. Rep. FPL-GTR-129. Madison, Wis.: U.S. Department of Agriculture, Forest Service, Forest Products Laboratory. p. 42.

Tissues are isolated from the trunk sections such that phloem, cambium, developing xylem, and maturing xylem are removed. These tissues are collected only from the current year's growth ring. Upon tissue removal in each case, the material is immediately plunged into liquid nitrogen to preserve the nucleic acids and other components. The bark is peeled from the section and phloem tissue removed from the inner face of the bark by scraping with a razor blade. Cambium tissue is isolated from the outer face of the peeled section by gentle scraping of the surface. Developing xylem and lignifying xylem are isolated by sequentially performing more vigorous scraping of the remaining tissue. Tissues are transferred from liquid nitrogen into containers for long term storage at −70° C. until RNA extraction and subsequent analysis is performed.

Example 29

This example illustrates procedures alternative to those used in the example above for RNA extraction and purification, particularly useful for RNA obtained from a variety of tissues of woody plants, and a procedure for hybridization and data analysis using the arrays described in Example 4.

RNA is isolated according to the protocol of Chang et al., Plant Mol. Biol. Rep. 11:113. DNA is removed using DNase I (Invitrogen, Carlsbad, Calif.) according to the manufacturer's recommendations. The integrity of the RNA samples is determined using the Agilent 2100 Bioanalyzer (Agilent Technologies, USA).

10 μg of total RNA from each tissue is reverse transcribed into cDNA using known methods.

In the case of Pinus radiata phloem tissue, it can be difficult to extract sufficient amounts of total RNA for normal labelling procedures. Total RNA is extracted and treated as previously described and 100 ng of total RNA is amplified using the Ovation™ Nanosample RNA Amplification system from NuGEN™ (NuGEN, CA, USA). Similar amplification kits such as those manufactured by Ambion may alternatively be used. The amplified RNA is reverse transcribed into cDNA and labelled as described above.

Hybridization and stringency washes are performed using the protocol as described in the US Patent Application for “Methods and Kits for Labeling and Hybridizing cDNA for Microarray Analysis” (supra) at 42° C. The arrays (slides) are scanned using a ScanArray 4000 Microarray Analysis System (GSI Lumonics, Ottawa, ON, Canada). Raw, non-normalized intensity values are generated using QUANTARRAY software (GSI Lumonics, Ottawa, ON, Canada).

A fully balanced, incomplete block experimental design (Kerr and Churchill, Gen. Res. 123:123, 2001) is used in order to design an array experiment that would allow maximum statistical inferences from analyzed data.

Gene expression data are analyzed using the SAS® Microarray Solution software package (The SAS Institute, Cary, N.C., USA). The resulting data are then visualized using JMP® (The SAS Institute, Cary, N.C., USA).

The analysis for this experiment uses an ANOVA approach with mixed model specification (Wolfinger et al., J. Comp. Biol. 8:625-637). Two linear mixed models are applied in sequence. The first, the normalization model, is applied for global normalization at the slide level. The second, the gene model, is applied for rigorous statistical inference on each gene. The two models are stated as Models (1) and (2).


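$$\log_2(Y_{ijkls}) = \theta_{ij} + D_k + S_l + DS_{kl} + \omega_{ijkls} \qquad (1)$$

$$R_{ijkls}^{(g)} = \mu_{ij}^{(g)} + D_k^{(g)} + S_l^{(g)} + DS_{kl}^{(g)} + SS_{ls}^{(g)} + \varepsilon_{ijkls}^{(g)} \qquad (2)$$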

Yijkls represents the intensity of the sth spot in the lth slide with the kth dye applying the jth treatment for the ith cell line. θij, Dk, Sl, and DSkl represent the mean effect of the jth treatment in the ith cell line, the kth dye effect, the lth slide random effect, and the random interaction effect of the kth dye in the lth slide. ωijkls is the stochastic error term. Rijkls(g) represents the residual of the gth gene from model (1). μij(g), Dk(g), Sl(g), and DSkl(g) play roles similar to those of θij, Dk, Sl, and DSkl except that they are specific to the gth gene. SSls(g) represents the spot-by-slide random effect for the gth gene. εijkls(g) represents the stochastic error term. All random terms are assumed to be normally distributed and mutually independent within each model.

According to the analysis described above, certain cDNAs, some of which are shown in Table 6 below, are found to be differentially expressed.

TABLE 6
Gene corresponding to SEQ ID | Oligo ID | Gene_Family | Expression
162 | Pra_000171_O_4 | Peptidylprolyl isomerase | steady state RNA higher in xylem than cambium
164 | Pra_001480_O_3 | Peptidylprolyl isomerase | steady state RNA lower in xylem than cambium
control | Pra_000218_O_2 | RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE LARGE CHAIN (EC1.17.4.1). | steady state RNA lower in xylem than cambium
control | Pra_000193_O_2 | PUTATIVE SURFACE PROTEIN. | steady state RNA lower in xylem than cambium

The involvement of these specific genes in wood development is inferred through the association of the up-regulation or down-regulation of genes to the particular stages of wood development. Both the spatial continuum of wood development across a section (phloem, cambium, developing xylem, maturing xylem) at a particular season and tree trunk position and the relationships of season and tree trunk position are considered when making associations of gene expression to the relevance in wood development.

Example 30

This example demonstrates how one can correlate polysaccharide gene expression with agronomically important wood phenotypes such as density, stiffness, strength, distance between branches, and spiral grain.

Mature clonally propagated pine trees are selected from among the progeny of known parent trees for superior growth characteristics and resistance to important fungal diseases. The bark is removed from a tangential section and the trees are examined for average wood density in the fifth annual ring at breast height, stiffness and strength of the wood, and spiral grain. The trees are also characterized by their height, mean distance between major branches, crown size, and forking.

To obtain seedling families that are segregating for major genes that affect density, stiffness, strength, distance between branches, spiral grain and other characteristics that may be linked to any of the genes affecting these characteristics, trees lacking common parents are chosen for specific crosses on the criterion that they exhibit the widest variation from each other with respect to the density, stiffness, strength, distance between branches, and spiral grain criteria. Thus, pollen from a tree exhibiting high density, low mean distance between major branches, and high spiral grain is used to pollinate cones from the unrelated plus tree among the selections exhibiting the lowest density, highest mean distance between major branches, and lowest spiral grain. It is useful to note that “plus trees” are crossed such that pollen from a plus tree exhibiting high density is used to pollinate developing cones from another plus tree exhibiting high density, for example, and pollen from a tree exhibiting low mean distance between major branches would be used to pollinate developing cones from another plus tree exhibiting low mean distance between major branches.

Seeds are collected from these controlled pollinations and grown such that the parental identity is maintained for each seed and used for vegetative propagation such that each genotype is represented by multiple ramets. Vegetative propagation is accomplished using micropropagation, hedging, or fascicle cuttings. Some ramets of each genotype are stored while vegetative propagules of each genotype are grown to sufficient size for establishment of a field planting. The genotypes are arrayed in a replicated design and grown under field conditions where the daily temperature and rainfall are measured and recorded.

The trees are measured at various ages to determine the expression and segregation of density, stiffness, strength, distance between branches, spiral grain, and any other observable characteristics that may be linked to any of the genes affecting these characteristics. Samples are harvested for characterization of cellulose content, lignin content, cellulose microfibril angle, density, strength, stiffness, tracheid morphology, ring width, and the like. RNA is then collected from replicated samples of trees showing divergent stiffness and density, or other characteristics, from genotypes that are otherwise as similar as possible in growth habit, in spring and fall so that early and late wood development is assayed. These samples are examined for gene expression as described in the examples above.

TABLE 7
Consensus ID Information.
Patent app SEQ ID | Gene Family | Consensus_ID | Expression
control | Ribonucleoside-diphosphate reductase | pinusRadiata_000218 | up in early spring xylem vs late summer xylem
Cell Cycle 168 | Peptidylprolyl isomerase | pinusRadiata_001692 | up in juvenile developing wood vs mature developing xylem
control | Nitrite transporter | pinusRadiata_016801 | up in mature developing xylem vs juvenile cambium

Ramets of each genotype are compared to ramets of the same genotype at different ages to establish age:age correlations for these characteristics.
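An age:age correlation of this kind can be illustrated with a Pearson correlation coefficient computed across genotypes. The application does not specify which statistic is used, so Pearson's r is an assumption here, and the density values and variable names below are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of measurements."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean wood density (kg/m3) of four genotypes at two ages.
density_age_4 = [380.0, 410.0, 395.0, 430.0]
density_age_8 = [420.0, 455.0, 430.0, 480.0]
r = pearson(density_age_4, density_age_8)
print(round(r, 2))  # 0.99
```

A coefficient near 1 would suggest that measurements taken at the earlier age are predictive of the later phenotype for that characteristic.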

Example 31

This example demonstrates how responses to environmental conditions such as light and season alter plant phenotype and can be correlated to polysaccharide synthesis gene expression using microarrays. In particular, the changes in gene expression associated with wood density are examined.

Trees of three different clonally propagated E. grandis hybrid genotypes are grown on a site with a weather station that measures daily temperatures and rainfall. During the spring and subsequent summer, genetically identical ramets of the three different genotypes are first photographed with north-south orientation marks, using photography at sufficient resolution to show bark characteristics of juvenile and mature portions of the plant, and then felled. The age of the trees is determined by planting records and confirmed by a count of the annual rings. In each of these trees, mature wood is defined as the outermost rings of the tree below breast height, and juvenile wood as the innermost rings of the tree above breast height. Each tree is accordingly sectored as follows:

NM—NORTHSIDE MATURE

SM—SOUTHSIDE MATURE

NT—NORTHSIDE TRANSITION

ST—SOUTHSIDE TRANSITION

NJ—NORTHSIDE JUVENILE

SJ—SOUTHSIDE JUVENILE

Tissue is harvested from the plant trunk as well as from juvenile and mature form leaves. Samples are prepared simultaneously for phenotype analysis, including plant morphology and biochemical characteristics, and gene expression analysis. The height and diameter of the tree at the point from which each sector was taken is recorded, and a soil sample from the base of the tree is taken for chemical assay. Samples prepared for gene expression analysis are weighed and placed into liquid nitrogen for subsequent preparation of RNA samples for use in the microarray experiment. The tissues are denoted as follows:

P—phloem

C—cambium

X1—expanding xylem

X2—differentiating and lignifying xylem

Thin slices in tangential and radial sections from each of the sectors of the trunk are fixed as described in Ruzin, PLANT MICROTECHNIQUE AND MICROSCOPY, Oxford University Press, Inc., New York, N.Y. (1999) for anatomical examination and confirmation of wood developmental stage. Microfibril angle is examined at the different developmental stages of the wood, for example juvenile, transition and mature phases of Eucalyptus grandis wood. Other characteristics examined are the ratio of fibers to vessel elements and ray tissue in each sector. Additionally, the samples are examined for characteristics that change between juvenile and mature wood and between spring wood and summer wood, such as fiber morphology, lumen size, and width of the S2 (thickest) cell wall layer. Samples are further examined for measurements of density in the fifth ring and determination of modulus of elasticity using techniques well known to those skilled in the art of wood assays. See, e.g., Wang, et al., Non-destructive Evaluations of Trees, EXPERIMENTAL TECHNIQUES, pp. 28-30 (2000).

For biochemical analysis, 50 grams from each of the harvest samples are freeze-dried and analyzed for quantities of simple sugars, amino acids, lipids, other extractives, lignin, and cellulose, using biochemical assays well known to those skilled in the art of plant biochemistry. See, e.g., Pettersen & Schwandt, J. Wood Chem. & Technol. 11:495 (1991).

In the present example, the phenotypes chosen for comparison are high density wood, average density wood, and low density wood. Nucleic acid samples are prepared as described in Example 3, from trees harvested in the spring and summer. Gene expression profiling by hybridization and data analysis is performed as described above.

Using similar techniques and clonally propagated individuals, one can examine polysaccharide gene expression as it relates to other complex wood characteristics such as strength, stiffness, and spirality.

Example 32

Example 32 demonstrates the use of a vascular-preferred promoter functionally linked to one of the genes of the instant application.

A vascular-preferred promoter is linked to one of the genes of the instant application and used to transform tree species. Boosted transcript levels of the candidate gene in the xylem of the transformants result in an increased xylem biomass phenotype.

In another example, a vascular-preferred promoter such as any of those in ArborGen's November 2003 patent applications is linked to an RNAi construct containing sequences from one of the genes of the instant application and used to transform a tree of the genus from which the gene was isolated. Reduced transcript levels of the candidate gene in the xylem of the transformants result in an increased xylem biomass phenotype.

Example 33

The vector pARB476 was developed using the following steps. The Bluescript vector (Stratagene, La Jolla, Calif.) was modified by adding the Superubiquitin 3′UTR and nos 3′terminator sequence at the KpnI and ClaI sites to produce the vector pARB005 (SEQ ID NO. 773). To this vector the P. radiata superubiquitin promoter with intron was added. The promoter/intron sequence was first amplified from the P. radiata superubiquitin sequence identified in U.S. Pat. No. 6,380,459 using standard PCR techniques and the primers of SEQ ID NOS 774 and 775. The amplified fragment was then ligated into pARB005 using XbaI and PstI restriction digestion to produce the vector pARB119 (SEQ ID NO. 776).
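
By way of illustration, the restriction sites carried by the amplification primers can be located with a simple string search. The following Python sketch scans the primers of SEQ ID NOS 774 and 775 for the recognition sequences of XbaI, PstI, KpnI, and NotI. The function name and the 0-based position convention are choices made for this sketch and are not part of the original description.

```python
# Standard recognition sequences for enzymes named in this example.
SITES = {
    "XbaI": "TCTAGA",
    "PstI": "CTGCAG",
    "KpnI": "GGTACC",
    "NotI": "GCGGCCGC",
}

# Primer sequences as given in the sequence listing.
SEQ_774 = "AAATCTAGAGGTACCATTTAAATGCGGCCGCAAAACCCCTCACAAATACATAA"
SEQ_775 = "TTTCTGCAGCTTGAAATTGAAATATGACTAACGAAT"


def find_sites(seq, sites=SITES):
    """Map each enzyme name to the 0-based start positions of its site in seq.

    Enzymes whose site is absent are omitted from the result.
    """
    seq = seq.upper()
    hits = {}
    for name, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # also catches overlapping sites
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    print("SEQ ID 774:", find_sites(SEQ_774))
    print("SEQ ID 775:", find_sites(SEQ_775))
```

The forward primer carries the XbaI site and the reverse primer carries the PstI site, consistent with the XbaI and PstI digestion used to ligate the amplified fragment into pARB005.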

The Populus tremuloides UDP Glucose Binding domain gene (patent WO 0071670, ptCelA GenBank number AF072131) was amplified using standard PCR techniques and primers including an ATG and a ClaI site as part of the 5′ primer and a TGA and a ClaI site as part of the 3′ primer. The amplified fragment was then cloned into the ClaI site of pARB119 to produce the vector pARB476 (SEQ ID NO. 777).
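
The primer layout described above (a ClaI site and ATG on the 5′ primer, a TGA and ClaI site on the 3′ primer) can be sketched in Python as follows. The actual primer sequences are not given in the text, so the annealing length, the three-base 5′ padding, and the example fragment below are hypothetical values chosen only to show the construction.

```python
CLAI = "ATCGAT"  # ClaI recognition sequence (palindromic)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(fragment, anneal_len=20, pad="AAA"):
    """Build a primer pair that adds ClaI-ATG upstream and TGA-ClaI downstream.

    fragment   -- coding-strand sequence to amplify (start/stop codons excluded)
    anneal_len -- bases of each primer that anneal to the template (hypothetical)
    pad        -- extra 5' bases ahead of the site (hypothetical)
    """
    fragment = fragment.upper()
    forward = pad + CLAI + "ATG" + fragment[:anneal_len]
    # The reverse primer is written 5'->3' on the bottom strand, so the stop
    # codon TGA appears as its reverse complement, TCA.
    reverse = pad + CLAI + "TCA" + reverse_complement(fragment[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Placeholder fragment, not the actual binding-domain sequence.
    toy = "GATCAGTTCC" + "ACGT" * 5 + "TGTGTACCCA"
    fwd, rev = design_primers(toy, anneal_len=10)
    print("5' primer:", fwd)
    print("3' primer:", rev)
```

Read on the coding strand, the resulting amplicon begins with ATCGATATG and ends with TGAATCGAT, which is the arrangement flanking the inserted gene in SEQ ID NO. 777.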

The NotI cassette containing the P. radiata superubiquitin promoter with intron::UDP Glucose Binding domain::3′UTR::nos terminator from pARB476 was removed and cloned into the NotI site of pART29 to produce the vector pARB483. The binary vector pART29 is a modified pART27 vector (Gleave, Plant Mol. Biol. 20:1203-1207, 1992) that contains the Arabidopsis thaliana ubiquitin 3 (UBQ3) promoter instead of the nos 5′ promoter and no lacZ sequences.

SEQ ID 773
CGATGGGTGTTATTTGTGGATAATAAATTCGGGTGATGTTCAGTGTTTGTCGTATTTCTCACGAATAAA
TTGTGTTTATGTATGTGTTAGTGTTGTTTGTCTGTTTCAGACCCTCTTATGTTATATTTTTCTTTTCGT
CGGTCAGTTGAAGCCAATACTGGTGTCCTGGCCGGCACTGCAATACCATTTCGTTTAATATAAAGACTC
TGTTATCCGTGAGCTCGAATTTCCCCGATCGTTCAAACATTTGGCAATAAAGTTTCTTAAGATTGAATC
CTGTTGCCGGTCTTGCGATGATTATCATATAATTTCTGTTGAATTACGTTAAGCATGTAATAATTAACA
TGTAATGCATGACGTTATTTATGAGATGGGTTTTTATGATTAGAGTCCCGCAATTATACATTTAATACG
CGATAGAAAACAAAATATAGCGCGCAAACTAGGATAAATTATCGCGCGCGGTGTCATCTATGTTACTAG
ATCGCGGCCGCATTTAAATGGTACCCAATTCGCCCTATAGTGAGTCGTATTACGCGCGCTCACTGGCCG
TCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCC
CTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGA
ATGGCGAATGGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGA
CCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCG
CCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACC
TCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTC
GCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACC
CTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGC
TGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTAGGTGGCACTTTTCG
GGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAG
ACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGT
CGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGT
AAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGAT
CCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGC
GGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTT
GGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGC
TGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCT
AACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGA
AGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATT
AACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGC
AGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCG
TGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACAC
GACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAA
GCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATT
TAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTT
CCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAAT
CTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAAC
TCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA
GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGT
GGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGC
GCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACT
GAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCC
GGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTA
TAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAG
CCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACAT
GTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGC
TCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCCAATACGCAA
ACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGC
GGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTAT
GCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCA
TGATTACGCCAAGCGCGCAATTAACCCTCACTAAAGGGAACAAAAGCTGGGGCCGCTCTAGAACTAGTG
GATCCCCCGGGCTGCAGGAATTCGTCCAGCAGTTGTCTGGAGCTCCACCAGAAATCTGGAAGCTTAT
SEQ ID 774
AAATCTAGAGGTACCATTTAAATGCGGCCGCAAAACCCCTCACAAATACATAA
SEQ ID 775
TTTCTGCAGCTTGAAATTGAAATATGACTAACGAAT
SEQ ID 776
tctagaggtaccatttaaatgcggccgcaaaacccctcacaaatacataaaaaaaattctttatttaat
tatcaaactctccactacctttcccaccaaccgttacaatcctgaatgttggaaaaaactaactacatt
gatataaaaaaactacattacttcctaaatcatatcaaaattgtataaatatatccactcaaaggagtc
tagaagatccacttggacaaattgcccatagttggaaagatgttcaccaagtcaacaagatttatcaat
ggaaaaatccatctaccaaacttactttcaagaaaatccaaggattatagagtaaaaaatctatgtatt
attaagtcaaaaagaaaaccaaagtgaacaaatattgatgtacaagtttgagaggataagacattggaa
tcgtctaaccaggaggcggaggaattccctagacagttaaaagtggccggaatcccggtaaaaaagatt
aaaatttttttgtagagggagtgcttgaatcatgttttttatgatggaaatagattcagcaccatcaaa
aacattcaggacacctaaaattttgaagtttaacaaaaataacttggatctacaaaaatccgtatcgga
ttttctctaaatataactagaattttcataactttcaaagcaactcctcccctaaccgtaaaacttttc
ctacttcaccgttaattacattccttaagagtagataaagaaataaagtaaataaaagtattcacaaac
caacaatttatttcttttatttacttaaaaaaacaaaaagtttatttattttacttaaatggcataatg
acatatcggagatccctcgaacgagaatcttttatctccctggttttgtattaaaaagtaatttattgt
ggggtccacgcggagttggaatcctacagacgcgctttacatacgtctcgagaagcgtgacggatgtgc
gaccggatgaccctgtataacccaccgacacagccagcgcacagtatacacgtgtcatttctctattgg
aaaatgtcgttgttatccccgctggtacgcaaccaccgatggtgacaggtcgtctgttgtcgtgtcgcg
tagcgggagaagggtctcatccaacgctattaaatactcgccttcaccgcgttacttctcatcttttct
cttgcgttgtataatcagtgcgatattctcagagagcttttcattcaaaggtatggagttttgaagggc
tttactcttaacatttgtttttctttgtaaattgttaatggtggtttctgtgggggaagaatcttttgc
caggtccttttgggtttcgcatgtttatttgggttatttttctcgactatggctgacattactagggct
ttcgtgctttcatctgtgttttcttcccttaataggtctgtctctctggaatatttaattttcgtatgt
aagttatgagtagtcgctgtttgtaataggctcttgtctgtaaaggtttcagcaggtgtttgcgtttta
ttgcgtcatgtgtttcagaaggcctttgcagattattgcgttgtactttaatattttgtctccaacctt
gttatagtttccctcctttgatctcacaggaaccctttcttctttgagcattttcttgtggcgttctgt
agtaatattttaattttgggcccgggttctgagggtaggtgattattcacagtgatgtgctttccctat
aaggtcctctatgtgtaagctgttagggtttgtgcgttactattgacatgtcacatgtcacatattttc
ttcctcttatccttcgaactgatggttctttttctaattcgtggattgctggtgccatattttatttct
attgcaactgtattttagggtgtctctttctttttgatttcttgttaatatttgtgttcaggttgtaac
tatgggttgctagggtgtctgccctcttcttttgtgcttctttcgcagaatctgtccgttggtctgtat
ttgggtgatgaattatttattccttgaagtatctgtctaattagcttgtgatgatgtgcaggtatattc
gttagtcatatttcaatttcaagcgatcccccgggctgcaggaattcgtccagcagttgtctggagctc
caccagaaatctggaagcttatcgatgggtgttatttgtggataataaattcgggtgatgttcagtgtt
tgtcgtatttctcacgaataaattgtgtttatgtatgtgttagtgttgtttgtctgtttcagaccctct
tatgttatatttttcttttcgtcggtcagttgaagccaatactggtgtcctggccggcactgcaatacc
atttcgtttaatataaagactctgttatccgtgagctcgaatttccccgatcgttcaaacatttggcaa
taaagtttcttaagattgaatcctgttgccggtcttgcgatgattatcatataatttctgttgaattac
gttaagcatgtaataattaacatgtaatgcatgacgttatttatgagatgggtttttatgattagagtc
ccgcaattatacatttaatacgcgatagaaaacaaaatatagcgcgcaaactaggataaattatcgcgc
gcggtgtcatctatgttactagatcgcggccgcatttaaatggtacccaattcgccctatagtgagtcg
tattacgcgcgctcactggccgtcgttttacaacgtcgtgactgggaaaaccctggcgttacccaactt
aatcgccttgcagcacatccccctttcgccagctggcgtaatagcgaagaggcccgcaccgatcgccct
tcccaacagttgcgcagcctgaatggcgaatgggacgcgccctgtagcggcgcattaagcgcggcgggt
gtggtggttacgcgcagcgtgaccgctacacttgccagcgccctagcgcccgctcctttcgctttcttc
ccttcctttctcgccacgttcgccggctttccccgtcaagctctaaatcgggggctccctttagggttc
cgatttagtgctttacggcacctcgaccccaaaaaacttgattagggtgatggttcacgtagtgggcca
tcgccctgatagacggtttttcgccctttgacgttggagtccacgttctttaatagtggactcttgttc
caaactggaacaacactcaaccctatctcggtctattcttttgatttataagggattttgccgatttcg
gcctattggttaaaaaatgagctgatttaacaaaaatttaacgcgaattttaacaaaatattaacgctt
acaatttaggtggcacttttcggggaaatgtgcgcggaacccctatttgtttatttttctaaatacatt
caaatatgtatccgctcatgagacaataaccctgataaatgcttcaataatattgaaaaaggaagagta
tgagtattcaacatttccgtgtcgcccttattcccttttttgcggcattttgccttcctgtttttgctc
acccagaaacgctggtgaaagtaaaagatgctgaagatcagttgggtgcacgagtgggttacatcgaac
tggatctcaacagcggtaagatccttgagagttttcgccccgaagaacgttttccaatgatgagcactt
ttaaagttctgctatgtggcgcggtattatcccgtattgacgccgggcaagagcaactcggtcgccgca
tacactattctcagaatgacttggttgagtactcaccagtcacagaaaagcatcttacggatggcatga
cagtaagagaattatgcagtgctgccataaccatgagtgataacactgcggccaacttacttctgacaa
cgatcggaggaccgaaggagctaaccgcttttttgcacaacatgggggatcatgtaactcgccttgatc
gttgggaaccggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgcctgtagcaatgg
caacaacgttgcgcaaactattaactggcgaactacttactctagcttcccggcaacaattaatagact
ggatggaggcggataaagttgcaggaccacttctgcgctcggcccttccggctggctggtttattgctg
ataaatctggagccggtgagcgtgggtctcgcggtatcattgcagcactggggccagatggtaagccct
cccgtatcgtagttatctacacgacggggagtcaggcaactatggatgaacgaaatagacagatcgctg
agataggtgcctcactgattaagcattggtaactgtcagaccaagtttactcatatatactttagattg
atttaaaacttcatttttaatttaaaaggatctaggtgaagatcctttttgataatctcatgaccaaaa
tcccttaacgtgagttttcgttccactgagcgtcagaccccgtagaaaagatcaaaggatcttcttgag
atcctttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctaccagcggtggtttgtt
tgccggatcaagagctaccaactctttttccgaaggtaactggcttcagcagagcgcagataccaaata
ctgtccttctagtgtagccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacctcg
ctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgtgtcttaccgggttggactcaa
gacgatagttaccggataaggcgcagcggtcgggctgaacggggggttcgtgcacacagcccagcttgg
agcgaacgacctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgcttcccgaag
ggagaaaggcggacaggtatccggtaagcggcagggtcggaacaggagagcgcacgagggagcttccag
ggggaaacgcctggtatctttatagtcctgtcgggtttcgccacctctgacttgagcgtcgatttttgt
gatgctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggcctttttacggttcctggcct
tttgctggccttttgctcacatgttctttcctgcgttatcccctgattctgtggataaccgtattaccg
cctttgagtgagctgataccgctcgccgcagccgaacgaccgagcgcagcgagtcagtgagcgaggaag
cggaagagcgcccaatacgcaaaccgcctctccccgcgcgttggccgattcattaatgcagctggcacg
acaggtttcccgactggaaagcgggcagtgagcgcaacgcaattaatgtgagttagctcactcattagg
caccccaggctttacactttatgcttccggctcgtatgttgtgtggaattgtgagcggataacaatttc
acacaggaaacagctatgaccatgattacgccaagcgcgcaattaaccctcactaaagggaacaaaagc
tggggccgctctag
SEQ ID 777
TCTAGAGGTACCATTTAAATGCGGCCGCAAAACCCCTCACAAATACATAAAAAAAATTCTTTATTTAAT
TATCAAACTCTCCACTACCTTTCCCACCAACCGTTACAATCCTGAATGTTGGAAAAAACTAACTACATT
GATATAAAAAAACTACATTACTTCCTAAATCATATCAAAATTGTATAAATATATCCACTCAAAGGAGTC
TAGAAGATCCACTTGGACAAATTGCCCATAGTTGGAAAGATGTTCACCAAGTCAACAAGATTTATCAAT
GGAAAAATCCATCTACCAAACTTACTTTCAAGAAAATCCAAGGATTATAGAGTAAAAAATCTATGTATT
ATTAAGTCAAAAAGAAAACCAAAGTGAACAAATATTGATGTACAAGTTTGAGAGGATAAGACATTGGAA
TCGTCTAACCAGGAGGCGGAGGAATTCCCTAGACAGTTAAAAGTGGCCGGAATCCCGGTAAAAAAGATT
AAAATTTTTTTGTAGAGGGAGTGCTTGAATCATGTTTTTTATGATGGAAATAGATTCAGCACCATCAAA
AACATTCAGGACACCTAAAATTTTGAAGTTTAACAAAAATAACTTGGATCTACAAAAATCCGTATCGGA
TTTTCTCTAAATATAACTAGAATTTTCATAACTTTCAAAGCAACTCCTCCCCTAACCGTAAAACTTTTC
CTACTTCACCGTTAATTACATTCCTTAAGAGTAGATAAAGAAATAAAGTAAATAAAAGTATTCACAAAC
CAACAATTTATTTCTTTTATTTACTTAAAAAAACAAAAAGTTTATTTATTTTACTTAAATGGCATAATG
ACATATCGGAGATCCCTCGAACGAGAATCTTTTATCTCCCTGGTTTTGTATTAAAAAGTAATTTATTGT
GGGGTCCACGCGGAGTTGGAATCCTACAGACGCGCTTTACATACGTCTCGAGAAGCGTGACGGATGTGC
GACCGGATGACCCTGTATAACCCACCGACACAGCCAGCGCACAGTATACACGTGTCATTTCTCTATTGG
AAAATGTCGTTGTTATCCCCGCTGGTACGCAACCACCGATGGTGACAGGTCGTCTGTTGTCGTGTCGCG
TAGCGGGAGAAGGGTCTCATCCAACGCTATTAAATACTCGCCTTCACCGCGTTACTTCTCATCTTTTCT
CTTGCGTTGTATAATCAGTGCGATATTCTCAGAGAGCTTTTCATTCAAAGGTATGGAGTTTTGAAGGGC
TTTACTCTTAACATTTGTTTTTCTTTGTAAATTGTTAATGGTGGTTTCTGTGGGGGAAGAATCTTTTGC
CAGGTCCTTTTGGGTTTCGCATGTTTATTTGGGTTATTTTTCTCGACTATGGCTGACATTACTAGGGCT
TTCGTGCTTTCATCTGTGTTTTCTTCCCTTAATAGGTCTGTCTCTCTGGAATATTTAATTTTCGTATGT
AAGTTATGAGTAGTCGCTGTTTGTAATAGGCTCTTGTCTGTAAAGGTTTCAGCAGGTGTTTGCGTTTTA
TTGCGTCATGTGTTTCAGAAGGCCTTTGCAGATTATTGCGTTGTACTTTAATATTTTGTCTCCAACCTT
GTTATAGTTTCCCTCCTTTGATCTCACAGGAACCCTTTCTTCTTTGAGCATTTTCTTGTGGCGTTCTGT
AGTAATATTTTAATTTTGGGCCCGGGTTCTGAGGGTAGGTGATTATTCACAGTGATGTGCTTTCCCTAT
AAGGTCCTCTATGTGTAAGCTGTTAGGGTTTGTGCGTTACTATTGACATGTCACATGTCACATATTTTC
TTCCTCTTATCCTTCGAACTGATGGTTCTTTTTCTAATTCGTGGATTGCTGGTGCCATATTTTATTTCT
ATTGCAACTGTATTTTAGGGTGTCTCTTTCTTTTTGATTTCTTGTTAATATTTGTGTTCAGGTTGTAAC
TATGGGTTGCTAGGGTGTCTGCCCTCTTCTTTTGTGCTTCTTTCGCAGAATCTGTCCGTTGGTCTGTAT
TTGGGTGATGAATTATTTATTCCTTGAAGTATCTGTCTAATTAGCTTGTGATGATGTGCAGGTATATTC
GTTAGTCATATTTCAATTTCAAGCGATCCCCCGGGCTGCAGGAATTCGTCCAGCAGTTGTCTGGAGCTC
CACCAGAAATCTGGAAGCTTATCGATATGGATCAGTTCCCCAAGTGGAATCCTGTCAATAGAGAAACGT
ATATCGAAAGGCTGTCGGCAAGGTATGAAAGAGAGGGTGAGCCTTCTCAGCTTGCTGGTGTGGATTTTT
TCGTGAGTACTGTTGATCCGCTGAAGGAACCGCCATTGATCACTGCCAATACAGTCCTTTCCATCCTTG
CTGTGGACTATCCCGTCGATAAAGTCTCCTGCTACGTGTCTGATGATGGTGCAGCTATGCTTTCATTTG
AATCTCTTGTAGAAACAGCTGAGTTTGCAAGGAAGTGGGTTCCGTTCTGCAAAAAATTCTCAATTGAAC
CAAGAGCACCGGAGTTTTACTTCTCACAGAAAATTGATTACTTGAAAGACAAGGTTCAACCTTCTTTCG
TGAAAGAACGTAGAGCAATGAAAAGGGATTATGAAGAGTACAAAGTCCGAGTTAATGCCCTGGTAGCAA
AGGCTCAGAAAACACCTGAAGAAGGATGGACTATGCAAGATGGAACACCTTGGCCTGGGAATAACACAC
GTGATCACCCTGGCATGATTCAGGTCTTCCTTGGAAATACTGGAGCTCGTGACATTGAAGGAAATGAAC
TACCTCGTCTAGTATATGTCTCCAGGGAGAAGAGACCTGGCTACCAGCACCACAAAAAGGCTGGTGCAG
AAAATGCTCTGGTGAGAGTGTCTGCAGTACTCACAAATGCTCCCTACATCCTCAATGTTGATTGTGATC
ACTATGTAAACAATAGCAAGGCTGTTCGAGAGGCAATGTGCATCCTGATGGACCCACAAGTAGGTCGAG
ATGTATGCTATGTGCAGTTCCCTCAGAGGTTTGATGGCATAGATAAGAGTGATCGCTACGCCAATCGTA
ACGTAGTTTTCTTTGATGTTAACATGAAAGGGTTGGATGGCATTCAAGGACCAGTATACGTAGGAACTG
GTTGTGTTTTCAACAGGCAAGCACTTTACGGCTACGGGCCTCCTTCTATGCCCAGCTTACGCAAGAGAA
AGGATTCTTCATCCTGCTTCTCATGTTGCTGCCCCTCAAAGAAGAAGCCTGCTCAAGATCCAGCTGAGG
TATACAGAGATGCAAAAAGAGAGGATCTCAATGCTGCCATATTTAATCTTACAGAGATTGATAATTATG
ACGAGCATGAAAGGTCAATGCTGATCTCCCAGTTGAGCTTTGAGAAAACTTTTGGCTTATCTTCTGTCT
TCATTGAGTCTACACTAATGGAGAATGGAGGAGTACCCGAGTCTGCCAACTCACCAACACTCATCAAGG
AAGCAATTCATGTCATCGGCTGTGGCTATGAAGAGAAGACTGAATGGGGAAAAGAGATTGGTTGGATAT
ATGGGTCAGTCACTGAGGATATCTTAAGTGGCTTCAAGATGCACTGCCGAGGATGGAGATCAATTTACT
GCATGCCCGTAAGGCCTGCATTCAAAGGATCTGCACCCATCAACCTGTCTGATAGATTGCACCAGGTCC
TCCGATGGGCTCTTGGTTCTGTGGAAATTTTCTTTAGCAGACACTGTCCCCTCTGGTACGGGTTTGGAG
GAGGCCGTCTTAAATGGCTCCAAAGGCTTGCGTATATAAACACCATTGTGTACCCATGAATCGATGGGT
GTTATTTGTGGATAATAAATTCGGGTGATGTTCAGTGTTTGTCGTATTTCTCACGAATAAATTGTGTTT
ATGTATGTGTTAGTGTTGTTTGTCTGTTTCAGACCCTCTTATGTTATATTTTTCTTTTCGTCGGTCAGT
TGAAGCCAATACTGGTGTCCTGGCCGGCACTGCAATACCATTTCGTTTAATATAAAGACTCTGTTATCC
GTGAGCTCGAATTTCCCCGATCGTTCAAACATTTGGCAATAAAGTTTCTTAAGATTGAATCCTGTTGCC
GGTCTTGCGATGATTATCATATAATTTCTGTTGAATTACGTTAAGCATGTAATAATTAACATGTAATGC
ATGACGTTATTTATGAGATGGGTTTTTATGATTAGAGTCCCGCAATTATACATTTAATACGCGATAGAA
AACAAAATATAGCGCGCAAACTAGGATAAATTATCGCGCGCGGTGTCATCTATGTTACTAGATCGCGGC
CGCATTTAAATGGTACCCAATTCGCCCTATAGTGAGTCGTATTACGCGCGCTCACTGGCCGTCGTTTTA
CAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATCCCCCTTTCGCC
AGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAA
TGGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACA
CTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTT
CCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCC
AAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTG
ACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCG
GTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAA
CAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTAGGTGGCACTTTTCGGGGAAATG
TGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAAC
CCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTA
TTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATG
CTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGA
GTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTAT
CCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGT
ACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAA
CCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTT
TTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATAC
CAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCG
AACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCAC
TTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTC
GCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGA
GTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGT
AACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGA
TCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAG
CGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCT
TGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTC
CGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCC
ACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTG
CCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGT
CGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACC
TACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCG
GCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTG
TCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGA
AAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTC
CTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCA
GCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCCAATACGCAAACCGCCTC
TCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTG
AGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGG
CTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACG
CCAAGCGCGCAATTAACCCTCACTAAAGGGAACAAAAGCTGGGGCCGCTCTAG

TABLE 8
pGrowth Information.
CW AR Plasmid(s) Promoter Gene Genesis ID
88 pGrowth14 SUBIN Cyclin A prga001823
88 pGrowth15 SUBIN Cyclin A prpe001264
88 pGrowth16 SUBIN Cyclin D prxa004540
88 pGrowth18 SUBIN Cyclin D prxl006271
88 pGrowth19 SUBIN Cyclin D prpb019661
88 pGrowth20 SUBIN WEE1-like protein prrd041233

To make the growth100 plasmids, an acceptor vector (pWVK202) was built by first inserting the NotI-SUBIN::UDPGBD::nos term-NotI cassette from pARB483a into plasmid pWVK147 at NotI. Next, the UDPGBD gene was removed using restriction sites PstI and ClaI. A polylinker containing the restriction sites PstI, NheI, AvrII, ScaI, and ClaI was inserted in place of the UDPGBD gene. Sites AvrII and NheI are both compatible with SpeI, a site found often in the plasmids provided by Genesis. ScaI is blunt, so any fragment can be blunted and then inserted at that position into the acceptor vector. Plasmids were received from Genesis and analyzed to determine which restriction sites would be most suitable for subcloning into the acceptor vector pWVK202. After the ligations were performed, the resulting products were checked by extensive restriction digest analysis to make sure that the desired plasmid had been created.
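
The end compatibility relied on above can be checked from the standard recognition sequences and cut positions of the enzymes. The following Python sketch is illustrative only; the function names are chosen for this sketch, and the equality test for compatibility is valid because every overhang involved is palindromic.

```python
# Enzyme -> (recognition site, cut index on the top strand).
# All sites are palindromic, so the bottom-strand cut mirrors the top one.
ENZYMES = {
    "PstI": ("CTGCAG", 5),   # CTGCA^G
    "NheI": ("GCTAGC", 1),   # G^CTAGC
    "AvrII": ("CCTAGG", 1),  # C^CTAGG
    "ScaI": ("AGTACT", 3),   # AGT^ACT
    "ClaI": ("ATCGAT", 2),   # AT^CGAT
    "SpeI": ("ACTAGT", 1),   # A^CTAGT
}


def end_type(name):
    """Return (kind, overhang) where kind is "5'", "3'", or "blunt"."""
    site, cut = ENZYMES[name]
    mirror = len(site) - cut  # bottom-strand cut, in top-strand coordinates
    if cut == mirror:
        return ("blunt", "")
    kind = "5'" if cut < mirror else "3'"
    return (kind, site[min(cut, mirror):max(cut, mirror)])


def compatible(a, b):
    """True if ends left by enzymes a and b can be ligated to each other."""
    return end_type(a) == end_type(b)


if __name__ == "__main__":
    for name in ENZYMES:
        print(name, end_type(name))
    print("AvrII/SpeI compatible:", compatible("AvrII", "SpeI"))
    print("NheI/SpeI compatible:", compatible("NheI", "SpeI"))
```

AvrII, NheI, and SpeI all leave the same 5′ CTAG overhang, while ScaI cuts in the middle of its site and leaves blunt ends. A fragment joined through AvrII or NheI to a SpeI end regenerates neither original site.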

TABLE 9
Eucalyptus grandis Cell Cycle Genes and Proteins.
Columns: Patent DNA SEQ ID NO; Patent Protein SEQ ID NO; Sequence Identifier; ORF start; ORF stop
1 236 eucalyptusSpp_003910 387 1820
2 237 eucalyptusSpp_019213 99 1007
3 238 eucalyptusSpp_036800 120 1004
4 239 eucalyptusSpp_040260 23 937
5 240 eucalyptusSpp_041965 149 1033
6 241 eucalyptusSpp_002906 199 1116
7 242 eucalyptusSpp_001518 41 982
8 243 eucalyptusSpp_008078 291 2042
9 244 eucalyptusSpp_009826 107 2236
10 245 eucalyptusSpp_010364 82 1749
11 246 eucalyptusSpp_011523 151 1560
12 247 eucalyptusSpp_024358 82 1644
13 248 eucalyptusSpp_039125 626 2782
14 249 eucalyptusSpp_005362 13 1467
15 250 eucalyptusSpp_044857 113 1558
16 251 eucalyptusSpp_001743 187 1686
17 252 eucalyptusSpp_012405 238 1653
18 253 eucalyptusSpp_003739 235 1539
19 254 eucalyptusSpp_022338 158 1618
20 255 eucalyptusSpp_028605 205 1530
21 256 eucalyptusSpp_041006 174 1499
22 257 eucalyptusSpp_006643 94 1332
23 258 eucalyptusSpp_045338 176 1342
24 259 eucalyptusSpp_046486 150 1283
25 260 eucalyptusSpp_012070 101 367
26 261 eucalyptusSpp_006617 9 1352
27 262 eucalyptusSpp_007827 89 1486
28 263 eucalyptusSpp_008036 80 1477
29 264 010212EGLA007017HT 160 1062
30 265 eucalyptusSpp_001596 172 1077
31 266 eucalyptusSpp_005870 66 989
32 267 eucalyptusSpp_006901 111 1541
33 268 eucalyptusSpp_006902 116 1615
34 269 eucalyptusSpp_007440 155 1453
35 270 eucalyptusSpp_008994 228 2033
36 271 eucalyptusSpp_024580 110 1258
37 272 eucalyptusSpp_037831 50 1462
38 273 eucalyptusSpp_034958 176 739
39 274 001209EGXC004488HT 150 1529
40 275 010310EGXD012820HT 247 1971
41 276 010310EGXD013036HT 136 1644
42 277 010316EGXF999037HT 48 836
43 278 010324EGXF002118HT 49 822
44 279 011019EGKA001923HT 185 751
45 280 eucalyptusSpp_000966 103 621
46 281 eucalyptusSpp_001037 41 559
47 282 eucalyptusSpp_004603 127 693
48 283 eucalyptusSpp_005465 28 639
49 284 eucalyptusSpp_006571 135 812
50 285 eucalyptusSpp_006786 119 613
51 286 eucalyptusSpp_007057 38 562
52 287 eucalyptusSpp_008670 109 1872
53 288 eucalyptusSpp_009137 74 1159
54 289 eucalyptusSpp_010285 54 2045
55 290 eucalyptusSpp_010600 53 1879
56 291 eucalyptusSpp_011551 7 690
57 292 eucalyptusSpp_020743 83 601
58 293 eucalyptusSpp_023739 125 535
59 294 eucalyptusSpp_024103 55 573
60 295 eucalyptusSpp_031985 147 842
61 296 eucalyptusSpp_032025 167 487
62 297 eucalyptusSpp_032173 195 890
63 298 eucalyptusSpp_033340 68 586
64 299 eucalyptusSpp_009143 182 3265
65 300 eucalyptusSpp_000349 165 1145
66 301 eucalyptusSpp_000575 529 1569
67 302 eucalyptusSpp_000804 156 1136
68 303 eucalyptusSpp_000805 90 1073
69 304 eucalyptusSpp_000806 66 1049
70 305 eucalyptusSpp_002248 277 1512
71 306 eucalyptusSpp_003203 33 1076
72 307 eucalyptusSpp_003209 65 973
73 308 eucalyptusSpp_004429 82 1047
74 309 eucalyptusSpp_004607 43 1101
75 310 eucalyptusSpp_004682 142 1095
76 311 eucalyptusSpp_005786 61 1257
77 312 eucalyptusSpp_005887 193 1527
78 313 eucalyptusSpp_005981 109 1155
79 314 eucalyptusSpp_006766 71 1213
80 315 eucalyptusSpp_006769 109 1785
81 316 eucalyptusSpp_006907 364 2685
82 317 eucalyptusSpp_007518 96 1412
83 318 eucalyptusSpp_007717 116 1702
84 319 eucalyptusSpp_007718 46 1101
85 320 eucalyptusSpp_007741 23 1258
86 321 eucalyptusSpp_007884 404 2644
87 322 eucalyptusSpp_008258 107 2383
88 323 eucalyptusSpp_008465 243 1625
89 324 eucalyptusSpp_008616 126 1127
90 325 eucalyptusSpp_008690 257 1390
91 326 eucalyptusSpp_008708 178 1632
92 327 eucalyptusSpp_008850 290 2917
93 328 eucalyptusSpp_009072 148 1197
94 329 eucalyptusSpp_009465 140 1567
95 330 eucalyptusSpp_009472 376 1737
96 331 eucalyptusSpp_009550 69 1010
97 332 eucalyptusSpp_010284 149 1423
98 333 eucalyptusSpp_010595 365 2677
99 334 eucalyptusSpp_010657 24 923
100 335 eucalyptusSpp_012636 221 3598
101 336 eucalyptusSpp_012748 44 1447
102 337 eucalyptusSpp_012879 196 1314
103 338 eucalyptusSpp_015515 193 1668
104 339 eucalyptusSpp_015724 78 1634
105 340 eucalyptusSpp_016167 85 2826
106 341 eucalyptusSpp_016633 74 1246
107 342 eucalyptusSpp_017485 100 4377
108 343 eucalyptusSpp_018007 58 2439
109 344 eucalyptusSpp_020775 159 1064
110 345 eucalyptusSpp_023132 118 1665
111 346 eucalyptusSpp_023569 57 1628
112 347 eucalyptusSpp_023611 250 1566
113 348 eucalyptusSpp_024934 106 1434
114 349 eucalyptusSpp_025546 190 1917
115 350 eucalyptusSpp_030134 102 2942
116 351 eucalyptusSpp_031787 75 1079
117 352 eucalyptusSpp_034435 99 1148
118 353 eucalyptusSpp_034452 232 1806
119 354 eucalyptusSpp_035789 72 1124
120 355 eucalyptusSpp_035804 315 2069
121 356 eucalyptusSpp_043057 145 1968
122 357 eucalyptusSpp_046741 130 1488
123 358 eucalyptusSpp_047161 269 1693
698 718 eucalyptusSpp_008994
699 719 eucalyptusSpp_009143
700 720 eucalyptusSpp_006366
701 721 eucalyptusSpp_006907
702 722 eucalyptusSpp_012636
703 723 eucalyptusSpp_015724
704 724 eucalyptusSpp_016167
705 725 eucalyptusSpp_017485
706 726 eucalyptusSpp_030134
707 727 eucalyptusSpp_046741
708 728 eucalyptusSpp_047161
709 729 eucalyptusSpp_017378
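
For illustration, rows of the table above can be parsed and the ORF coordinates converted to lengths. The following Python sketch assumes that ORF start and ORF stop are 1-based, inclusive nucleotide positions and that the stop position includes the stop codon. The table does not state these conventions, so they are assumptions of this sketch, as are the function and field names.

```python
def parse_row(line):
    """Parse one whitespace-separated table row into a dict.

    Rows without ORF coordinates (three fields) get None for start and stop.
    """
    fields = line.split()
    row = {
        "dna_seq_id": int(fields[0]),
        "protein_seq_id": int(fields[1]),
        "identifier": fields[2],
        "orf_start": None,
        "orf_stop": None,
    }
    if len(fields) == 5:
        row["orf_start"] = int(fields[3])
        row["orf_stop"] = int(fields[4])
    return row


def orf_lengths(row):
    """Return (nucleotides, amino acids) for a row, or None without coordinates.

    Assumes inclusive 1-based coordinates with the stop codon counted in the
    ORF, so the encoded protein has one residue fewer than the codon count.
    """
    if row["orf_start"] is None:
        return None
    nucleotides = row["orf_stop"] - row["orf_start"] + 1
    if nucleotides % 3:
        raise ValueError(f"{row['identifier']}: ORF length is not a multiple of 3")
    return nucleotides, nucleotides // 3 - 1


if __name__ == "__main__":
    for line in ("2 237 eucalyptusSpp_019213 99 1007",
                 "698 718 eucalyptusSpp_008994"):
        row = parse_row(line)
        print(row["identifier"], orf_lengths(row))
```

Under these assumptions, the ORF at positions 99 to 1007 spans 909 nucleotides, that is, 303 codons including the stop codon.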

TABLE 10
Pinus radiata cell cycle genes and proteins.
Columns: Patent DNA SEQ ID NO; Patent Protein SEQ ID NO; Sequence Identifier; ORF start; ORF stop
124 359 pinusRadiata_001766 1163 2545
125 360 pinusRadiata_002927 152 1582
126 361 990309PRCA009171HT 389 1297
127 362 pinusRadiata_013714 38 946
128 363 pinusRadiata_016332 180 1088
129 364 pinusRadiata_021677 40 948
130 365 pinusRadiata_027562 229 1134
131 366 pinusRadiata_001504 105 2642
132 367 pinusRadiata_015211 187 2580
133 368 pinusRadiata_020421 220 1749
134 369 pinusRadiata_003187 438 1748
135 370 pinusRadiata_015661 240 1631
136 371 pinusRadiata_013874 252 1604
137 372 pinusRadiata_014615 261 1817
138 373 pinusRadiata_004578 167 1576
139 374 pinusRadiata_023387 183 1598
140 375 pinusRadiata_006970 98 1126
141 376 pinusRadiata_010322 148 894
142 377 pinusRadiata_022721 287 1363
143 378 pinusRadiata_023407 251 1348
144 379 pinusRadiata_001945 229 510
145 380 pinusRadiata_008233 92 409
146 381 pinusRadiata_008234 64 381
147 382 pinusRadiata_022054 68 349
148 383 pinusRadiata_012137 125 1849
149 384 pinusRadiata_012582 70 1602
150 385 pinusRadiata_015285 140 1465
151 386 pinusRadiata_017229 628 2565
152 387 pinusRadiata_020724 55 1818
153 388 pinusRadiata_004555 259 1710
154 389 pinusRadiata_004556 356 1807
155 390 pinusRadiata_005729 261 1298
156 391 pinusRadiata_007395 365 2251
157 392 pinusRadiata_009503 156 1454
158 393 pinusRadiata_011283 203 1348
159 394 pinusRadiata_012322 229 1644
160 395 pinusRadiata_018671 156 1454
161 396 pinusRadiata_023236 27 2222
162 397 pinusRadiata_000171 71 1759
163 398 pinusRadiata_000172 358 2040
164 399 pinusRadiata_001480 238 756
165 400 pinusRadiata_001481 285 803
166 401 pinusRadiata_001483 190 708
167 402 pinusRadiata_001484 156 674
168 403 pinusRadiata_001692 176 1912
169 404 pinusRadiata_005313 64 765
170 405 pinusRadiata_006362 93 881
171 406 pinusRadiata_006493 372 1070
172 407 pinusRadiata_006983 28 594
173 408 pinusRadiata_006984 34 648
174 409 pinusRadiata_007665 481 1611
175 410 pinusRadiata_012196 93 584
176 411 pinusRadiata_013382 250 1869
177 412 pinusRadiata_016461 84 422
178 413 pinusRadiata_017611 128 1213
179 414 pinusRadiata_019776 265 837
180 415 pinusRadiata_020659 38 781
181 416 pinusRadiata_022559 38 526
182 417 pinusRadiata_024188 37 1158
183 418 pinusRadiata_027973 61 768
184 419 pinusRadiata_001353 421 2172
185 420 pinusRadiata_001978 163 1647
186 421 pinusRadiata_002810 192 1172
187 422 pinusRadiata_002811 131 1111
188 423 pinusRadiata_002812 149 1726
189 424 pinusRadiata_003514 948 2228
190 425 pinusRadiata_004104 332 1465
191 426 pinusRadiata_005595 232 1590
192 427 pinusRadiata_005754 207 1550
193 428 pinusRadiata_006463 221 1171
194 429 pinusRadiata_006665 221 3679
195 430 pinusRadiata_006750 269 1252
196 431 pinusRadiata_007030 214 1242
197 432 pinusRadiata_007854 119 2065
198 433 pinusRadiata_007917 186 1550
199 434 pinusRadiata_007989 244 3671
200 435 pinusRadiata_008506 163 1431
201 436 pinusRadiata_008692 155 1081
202 437 pinusRadiata_008693 537 1463
203 438 pinusRadiata_009170 284 1909
204 439 pinusRadiata_009408 610 1659
205 440 pinusRadiata_009522 241 1452
206 441 pinusRadiata_009734 223 1173
207 442 pinusRadiata_009815 251 1777
208 443 pinusRadiata_010670 367 1419
209 444 pinusRadiata_011297 284 1303
210 445 pinusRadiata_013098 684 1784
211 446 pinusRadiata_013172 336 2738
212 447 pinusRadiata_013589 81 1622
213 448 pinusRadiata_013608 399 1460
214 449 pinusRadiata_014299 207 1673
215 450 pinusRadiata_014498 263 1309
216 451 pinusRadiata_014548 232 2529
217 452 pinusRadiata_014610 56 2950
218 453 pinusRadiata_015460 56 1234
219 454 pinusRadiata_016090 193 2577
220 455 pinusRadiata_016722 187 1233
221 456 pinusRadiata_016785 51 1436
222 457 pinusRadiata_017094 525 2351
223 458 pinusRadiata_017527 152 1099
224 459 pinusRadiata_017591 470 4114
225 460 pinusRadiata_017769 196 2007
226 461 pinusRadiata_018047 214 1323
227 462 pinusRadiata_018414 68 2146
228 463 pinusRadiata_018986 874 3705
229 464 pinusRadiata_019479 360 1754
230 465 pinusRadiata_020144 185 1384
231 466 pinusRadiata_022480 241 1533
232 467 pinusRadiata_023079 230 1435
233 468 pinusRadiata_026739 101 2857
234 469 pinusRadiata_026951 43 1548
235 470 pinusRadiata_026529 206 1657
710 730 pinusRadiata_000888
711 731 pinusRadiata_004578
712 732 pinusRadiata_007989
713 733 pinusRadiata_009522
714 734 pinusRadiata_014610
715 735 pinusRadiata_017591
716 736 pinusRadiata_017769
717 737 pinusRadiata_026951

TABLE 11
Annotated Peptide Sequences of the Present Invention.
Entry Sequence Description Annotated Peptide Sequence
1 The amino acid sequence of SEQ ID MGDGSLGSGGRGNSGGGGGGGSRPEWLQQYDLIGKIGEG
261. The conserved eukaryotic TYGLVFLARIKHPSTNRGKYIAIKKFKQSKDGDGVSPTA
protein kinase domain is IREIMLLREISHENVVKLVNVHINPVDMSLYLAFDYADH
underlined and the DLYEIIRHHRDKVNQAINPYTVKSLLWQLLNGLNYLHSN
serine/threonine protein kinases WIIHRDLKPSNILVMGEGEEQGVVKIADFGLARVYQAPL
active-site signature is in bold. KPLSDNGVVVTIWYRAPELLLGAKHYTSAVDMWAVGCIF
AELLTLKPLFQGQEVKANPNPFQLDQLDKIFKVLGHPTQ
EKWPMLVNLPHWQSDVQHIQRHKYDDNALGNVVRLSSKN
ATFDLLSKMLEYDPQKRITAAQALEHEYFRMEPLPGRNA
LVPSSPGDKVNYPTRPVDTTTDIEGTTSLQPSQSASSGN
AVPGNMPGPHVVTNRPMPRPMHMVGMQRVPASGMAGYNL
NPSGMGGGMNPSGIPMQRGVANQAQQSRRKDPGMGMGGY
PPQQKQRRF
2 The amino acid sequence of SEQ ID MEKYQQLAKIGEGTYGIVYKAKDKKSGELLALKKIRLEA
262. The conserved eukaryotic EDEGIPSTAIREISLLKQLQHPNIVRLYDVVHTEKKLTL
protein kinase domain is VFEFLDQDLKKYLDACGDNGLEPYTVKSFLYQLLQGIAF
underlined and the protein kinases CHEHRVLHRDLKPQNLLINMEGELKLADFGLARAFGIPV
ATP-binding region and RNYTHEVVTLWYRAPDVLMGSRKYSTQVDIWSVGCIFAE
serine/threonine protein kinases MVNGRPLFPGSSEQDQLLRIFKTLGTPSLKTWPGMAELP
active-site signatures are in DFKDNFPKYVVQSFKKICPKKLDKTGLDLLSRMLQYDPA
bold. KRISAEQAMGHPYFKDLKLRKPKAAGPGP
3 The amino acid sequence of SEQ ID MDQYEKIEKIGEGTYGVVYKAIDRSTNKTIALKKIRLEQ
263. The conserved eukaryotic EDEGVPSTAIREISLLKEMQHGNIVKLQDVVHSERRLYL
protein kinase domain is VFEYLDLDLKKHMDSCPEFSKDTHTIKMFLYQILRGISY
underlined and the protein kinases CHSHRVLHRDLKPQNLLLDRRTNSLKLADFGLARAFGIP
ATP-binding region and VRTFTHEVVTLWYRAPEILLGSRHYSTPVDVWSVGCIFA
serine/threonine protein kinases EMVNRRPLFPGDSEIDELFKIFRIMGTPNEDSWPGVTSL
active-site signatures are in PDFKSTFPKWASQDLKTVTPTVDPAGIDLLSKMLCMDPR
bold. RRITAKVALEHEYFKDVGVIP
4 The amino acid sequence of SEQ ID MVMKSKLDKYEKLEKLGEGTYGVVYKAQDKTTKEIYALK
264. The conserved eukaryotic KIRLESEDEGIPSTAIREIALLKELQHPNVVRIHDVIHT
protein kinase domain is NKKLILVFEFVDYDLKKFLHNFDKGIDPKIVKSLLYQLV
underlined and the protein kinases RGVAHCHQQKVLHRDLKPQNLLVSQEGILKLGDFGLARA
ATP-binding region and FGIPVKNYTNEVVTLWYRAPDILLGSKNYSTSVDIWSIG
serine/threonine protein kinases CIFVEMLNQKPLFPGSSEQDQLKKIFKIMGTPDATKWPG
active-site signatures are in IAELPDWKPENFEKYPGEPLNKVCPKMDPDGLDLLDKML
bold. KCNPSERIAAKNAMSHPYFKDIPDNLKKLYN
5 The amino acid sequence of SEQ ID MDQYEKVEKIGEGTYGVVYKAIDRLTNETIALKKIRLEQ
265. The conserved eukaryotic EDEGVPSTAIREISLLKEMQHGNIVRLQDVVHSENRLYL
protein kinase domain is VFEYLDLDLKKHMDSSPDFAKDPRLVKIFLYQILRGIAY
underlined and the protein kinases CHSHRVLHRDLKPQNLLIDRRTNALKLADFGLARAFGIP
ATP-binding region and VRTFTHEVVTLWYRAPEILLGSRHYSTPVDVWSVGCIFA
serine/threonine protein kinases EMVNQRPLFPGDSEIDELFKIFRILGTPNEDTWPGVTAL
active-site signatures are in PDFKSAFPKWPAKNLQDMVPGLNSAGIDLLSKMLCLDPS
bold. KRITARSALEHEYFKDIGFVP
6 The amino acid sequence of SEQ ID MEKYEKLEKVGEGTYGKVYKAKDKATGQLVALKKTRLEM
266. The conserved eukaryotic DEEGVPPTALREVSLLQLLSQSLYVVRLLSVEHVDGGSK
protein kinase domain is RKAAAAAAAEGGGGEAHGGGAVGGGKPMLYLVFEYLDTD
underlined and the protein kinases LKKFIDSHRKGPNPRPVPAATVQNFLYQLLKGVAHCHSH
ATP-binding region and GVLHRDLKPQNLLVDKEKGILKIADLGLGRAFTVPLKSY
serine/threonine protein kinases THEVFAFLAILLWRSEGESAADFDSXFRVSPVQVVTLWY
active-site signatures are in RAPEVLLGSAHYSIGVDMWSVGCIFAEMVRRQALFPGDS
bold. EFQQLLHIFRLLGTPTEKQWPGVTTLRDWHVYPQWEPQN
LARAVPSLGPDGVDLLSKMLKYDPAERISAKAALDHPFF
DSLDKSQF
7 The amino acid sequence of SEQ ID MERPATAAVSAMEAFEKLEKVGEGTYGKVYRAREKATGK
267. The conserved eukaryotic IVALKKTRLHEDEEGVPPTTLREISILRMLSRDPHIVRL
protein kinase domain is MDVKQGQNKEGKTVLYLVFEYMETDLKKYIRGFRSSGES
underlined and the protein kinases IPVNIVKSLMYQLCKGVAFCHGHGVLHRDLKPHNLLMDK
ATP-binding region and KTLTLKIADLGLARAFTVPIKKYTHEILTLWYRAPEVLL
serine/threonine protein kinases GATHYSTAVDMWSVGCIFAELVTKQALFPGDSELQQLLH
active-site signatures are in IFRLLGTPNEKMWPGVSSLMNWHEYPQWKPQSLSTAVPN
bold. LDKDGLDLLSQMLHYEPSRRISAKAAMEHPYFDDVNKTCL
8 The amino acid sequence of SEQ ID MGCVLGREVSSGIVTESKGRDSSEVETSKRDDSVAAKVE
268. The conserved eukaryotic GEGKAEEVRTEETQKKEKVEDDQQSREQRRRSKPSTKLG
protein kinase domain is NLPKHIRGEQVAAGWPSWLSDICGEALNGWIPRRANTFE
underlined and the KIDKIGQGTYSNVYKAKDLLTGKIVALKKVRFDNLEPES
serine/threonine protein kinases VRFMAREILILRHLDHPNVVKLEGLVTSRMSCSLYLVFE
active-site signature is in bold. YMEHDLAGLAASPAIKFTEPQVKCYMHQLLSGLEHCHNR
RVLHRDIKGSNLLIDNGGVLKIGDFGLASFYDPDHKHRM
TSRVVTLWYRPPELLLGANDYGVGIDLWSAGCILAELLA
GKPIMPGRTEVEQLHKIYKLCGSPSEEYWKKYKLPNATL
FKPREPYRRCIRETFKDFPPSSLPLIETLLAIDPAERGT
ATDALQSEFFRTEPYACEPSSLPQYPPSKEMDAKKRDDE
ARRLRAASKGQADGSKKERTRDRRVRAVPAPEANAELQH
NIDRRRLISHANAKSKSEKFPPPHQDGALGFPLGASHRF
DPAVVPPDVPFTSTSFTSSKEHDQTWSGPLVDPPGAPRR
KKHSAGGQRESSKLSMGTNKGRRADSHLKAYESKSIA
9 The amino acid sequence of SEQ ID MYSKSSAVDDSRESPKDRVSSSRRLSEVKTSRLDSSRRE
269. The conserved eukaryotic NGFRARDKVGDVSVMLIDKKVNGSARFCDDQIEKKSDRL
protein kinase domain is QKQRRERAEAAAAADHPGAGRVPKAVEGEQVAAGWPVWL
underlined and the SAVAGEAIKGWLPRRADTFEKLDKIGQGTYSSVYKARDV
serine/threonine protein kinase TNNKIVALKRVRFDNLDTESVKFMAREIHILRMLDHPNV
active-site signature is in bold. IKLEGLITSRMSCSLYLVFEYMEHDLTGLASRPDVKFSE
PQIKCYMKQLLSGLDHCHKHGVLHRDIKGSNLLIDNNGI
LKIADFGLASVFDPHQTAPLTSRVVTLWYRPPELLLGAS
RYGVEVDLWSTGCILGELYTGKPILPGKTEVEQLHKIFK
LCGSPSDDYWRRLHLPHAAVFKPPQPYRRCVAEIFKELP
PVALGLLETLISVDPSQRGTAAFALRSEFFTASPLPCDP
SSLPKYPPSKEIDMKLREEEARRRGAAGGKNELEKRGTK
DSRTNSAYYPNAGQLQVKQCHSNANGRSEIFGPYQEKTV
SGFLVAPPKQARVSKETRKDYAEQPDRASFSGPLVPGPG
FSKAGKELGHSITVSRNTNLSTLSSLVTSRTGDNKQKSG
PLVSESANQASRYSGPIREMEPARKQDRRSHVRTNIDYR
SREDGNSSTKEPALYGRGSAGNKIYVSGPLLVSSNNVDQ
MLKEHDRRIQEHARRARFDKARVGNNHPQAAVDSKLVSV
HDAG
10 The amino acid sequence of SEQ ID MGCIPTIISDGRRRSAAPDKRRPRPRRSSSEGEAPPHAT
270. The conserved eukaryotic AAGSEGGESARGAPGKERPEPAPRFVVRSPQGWPPWLVA
protein kinase domain is AVGHAIGEFVPRCADSFRKLAKIGEGTYSNVYKARDLVT
underlined and the GKTVALKKVRFDNLEAESIKFMAREILVLTRLNHPNVIK
serine/threonine protein kinase LEGPVTSRMSSGLYLAFEYMEHDLSGIAARQNGKFTEPQ
active-site signature is in bold. VKCFMRQLLSGLEHCHNHDVLHRDIKCSNLLIDNEGNLK
IADFGLATFYDPERKQVMTNRVVTLWYRAPELLLGATSY
GIGIDLWSAGCILAELLYGKPIMPGRTEVEQLHKIFKLC
GSPSEAYWNKFKLPNANIFKPPQPYARCIAETFKDFPPS
ALPLLETLLSIDPDERGTATTALNSEFFAAEPHACEPSS
LPKYPPSKEMDLKLIKEKTRRDSSKRPSAIHGSRRDGIH
DRAGRVIPAPEATAENQATLHRPRAMKKANPMSRSEKFP
PAHMDGVVGSSANAWLSGPASNAAPDSRRHRSLNQNPSS
SVGKASTGSSTTQETLKVAPELLQVGSSSLHPCHRMLVY
GSNLTIRSK
11 The amino acid sequence of SEQ ID MGCICAKQADRGPASPGSGILTGAGTGTGTRSSKIPSGL
271. The conserved protein kinase FEFEKSGVKEHGGRSGELRKLEEKGSLSKRLRLELGFSH
family domain is underlined, and RYVEAEQAAAGWPSWLTAVAGDAIQGLVPLKADSFEKLE
the serine/threonine protein KIGQGTYSSVFRARELANGRMVALKKVRFDNFQPESIQF
kinases active-site signature is MAREISILRRLDHPNIMKLEGIITSRMSNSIYLVFEYME
in bold. HDLYGLISSPQVKFSDAQVKCYMKQLLSGIEHCHQHGVI
HRDVKSSNILVNNEGILRIGDFGLANILNPKDRQQLTSH
VVTLWYRPPELLMGSTSYGVTVDLWSVGCVFAELMFRKP
ILRGRTEVEQLHKIFKLCGSPPDGYWKMCKVPQATMFRP
RHAYECTLRERCKGIATSAMKLMETFLSIEPHKRGTASS
ALISEYFRTVPYACDPSSLPKYPPNKEIDAKHREEARRK
KARSRVREAEVGKRPTRIHRASQEQGFSSNIAPKEKRSYA
12 The amino acid sequence of SEQ ID MAVAAPGHLNVNESPSWGSRSVDCFEKLEQIGEGTYGQV
272. The conserved eukaryotic YMAKEKKTGEIVALKKIRMDNEREGFPITAIREIKILKK
protein kinase domain is LHHENVIKLKEIVTSPGPEKDEQGRPEGNKYKGGIYMVF
underlined and the protein kinases EYMDHDLTGLADRPGMRFSVPQIKCYMRQLLTGLHYCHI
ATP-binding region and NQVLHRDIKGSNLLIDNEGNLKLADFGLARSFSNDHNAN
serine/threonine protein kinases LTNRVITLWYRPPELLLGATKYGPAVDMWSVGCIFAELL
active-site signatures are in HGKPIFPGKDEPEQLNKIFELCGAPDEINWPGVSKIPWY
bold. NNFKPTRPMKRRLREVFRHFDRHALELLERMLTLDPSQR
ISAKDALDAEYFWADPLPCDPKSLPKYESSHEFQTKKKR
QQQRQHEETAKRQKLQHPPQHPRLPPVQQSGQAHAQMRP
GPNQLMHGSQPPVATGPPGHHYGKPRGPSGGAGRYPSSG
NPGGGYNHPSRGGQGGSGGYNSGPYPPQGRAPPYGSSGM
PGAGPRGGGGNNYGVGPSNYPQGGGGPYGGSGAGRGSNM
MGGNRNQQYGWQQ
13 The amino acid sequence of SEQ ID MGCICTKGILPAHYRIKDGGLKLSKSSKRSVGSLRRDEL
273. The conserved AVSANGGGNDAADRLISSPHEVENEVEDRKNVDFNEKLS
serine/threonine protein kinase KSLQRRATMDVASGGHTQAQLKVGKVGGFPLGERGAQVV
domain is underlined, and the AGWPSWLTAVAGEAINGWVPRRADSFEKLEKIGQGTYSS
serine/threonine protein kinase VYRARDLETNTIVALKKVRFANMDPESVRFMAREIIIMR
active-site signature is in bold. KLDHPNVMKLEGLITSRVSGSLYLVFEYMDHDLAGLAAT
PSIKLTESQIKCYMQQLLRGLEYCHSHGVLHRDIKGSNL
LVDNNGNLKIGDFGLATFFRTNQKQPLTSRVVTLWYRPP
ELLLGSSDYGASVDLWSSGCILAELFAGKPIMPGRTEVE
QLHKIFKLCGSPSEEYWKKSKLPHATIFKPQQPYKRCLL
ETFKDFPSSALGLLDVLLAVEPECRGTASSALQNEFFTS
NPLPSDPSSLPKYPSSKEFDARLRDEEARKHKATAGKAR
GLESIRKGSKESKVVPTSNANADLKASIQKRQEQSNPRS
TGEKPGGTTQNNFILSGQSAKPSLNGSTQIGNANEVEAL
IVPDRELDSPRGGAELRRQRSFMQRRASQLSRFSNSVAV
GGDSHLDCSREKGANTQWRDEGFVARCSHPDGGELAGKH
DWSHHLLHRPISLFKKGGEHSRRDSIASYSPKKGRIHYS
GPLLPSGDNLDEMLKEHERQIQNAVRKARLDKVKTKREY
ADHGQTESLLCWANGR
14 The amino acid sequence of SEQ ID MDPDPSPDPDPPKSWSIHTRREIIARYEILERVGSGAYS
274. The conserved protein kinase DVYRGRRLSDGLAVALKEVHDYQSAFREIEALQILRGSP
family domain is underlined and HVVLLHEYFWREDEDAVLVLEFLRSDLAAVIADASRRPR
the serine/threonine protein DGGGGGAAALRAGEVKRWMLQVLEGVDACHRNSIVHRDL
kinases active-site signature is KPGNLLISEEGVLKIADFGQARILLDDGNVAPDYEPESF
in bold. EERSSEQADILQQPETMEADTTCPEGQEQGAITREAYLR
EVDEFKAKNPRHEIDKETSIFDGDTSCLATCTTSDIGED
PFKGSYVYGAEEAGEDAQGCLTSCVGTRWFRAPELLYGS
TDYGLEVDLWSLGCIFAELLTLEPLFPGISDIDQLSRIF
NVLGNLSEEVWPGCTKLPDYRTISFCKIENPIGLESCLP
NCSSDEVSLVRRLLCYDPAARATPMELLQDKYFTEEPLP
VPISALQVPQSKNSHDEDSAGGWYDYNDMDSDSDFEDFG
PLKFTPTSTGFSIQFP
15 The amino acid sequence of SEQ ID MDPDPSPSPDPPKSWSIHTRREIIARYEILERVGSGAYS
275. The conserved DVYRGRRLSDGLAVALKEVHDYQSAFREIEALQILRGSP
serine/threonine protein kinase HVVLLHEYFWREDEDAVLVLEFLRSDLAAVIADASRRPR
domain is underlined, and the GGGVAPLRAGEGKRWMLQVLEGVDACHRNSIVHRDLKPG
serine/threonine protein kinase NLLISEEGVLKIADFGQARILLDDGNVAPDYEPESFEER
active-site signature is in bold. SSEQADILQQPETMEADTTCPEGQEQGAITREAYLREVD
EFKAKNPRHEIDKETSIYDGDTSCLATCTTSDIGEDPFK
GSYVYGAEEAGEDAQGSLTSCVGTRWFRAPELLYGSTDY
GLEVDLWSLGCIFAELLTLEPLFPGISDIDQLSRIFNVL
GNLSEEVWPGCTKLPDYRTISFCKIENPIGLESCLPNCS
SDEVSLVRRLLCYDPAARATPMELLQDKYFTEEPLPVPI
SALQVPQSKNSHDEDSAGGWYDYNDMDSDSDFEDFGPLK
FTPTSTGFSIQFP
16 The amino acid sequence of SEQ ID MSNQHRRSSFSSSTTSSLAKRHASSSSSSLENAGKAFAA
276. The conserved cyclin and AAVPSHLAKKRAPLGNLTNLKAGDGNSRSSSAPSTLVAN
cyclin C-terminal domains are ATKLAKTRKGSSTSSSIMGLSGSALPRYASTKPSGVLPS
underlined and the cyclins VNPSIPRIEIAVDPMSCSMVVSPSRSDMQSVSLDESMST
signature is in bold. CESFKSPDVEYIDNEDVSAVDSIDRKTFSNLYISDAAAK
TAVNICERDVLMEMETDEKIVNVDDNYSDPQLCATIACD
IYQHLRASEAKKRPSTDFMDRVQKDITASMRAILIDWLV
EVAEEYRLVPDTLYLTVNYIDRYLSGNVMNRQRLQLLGV
ACMMIAAKYEEICAPQVEEFCYITDNTYFKEEVLQMESS
VLNYLKFEMTAPTVKCFLRRFVRAAQGVNEVPSLQLECM
ANYIAELSLLEYDMLCYAPSLVAASAIFLAKFVITPSKR
PWDPTLQHYTLYQPSDLGNCVKDLHRLCFNNHGSTLPAI
REKYSQHKYKYVAKKYCPPSIPPEFFHNLVY
17 The amino acid sequence of SEQ ID MNKENAVGTKSEAPTIRITRSRSKALGTSTGMLPSSRPS
277. The conserved cyclin and FKQEQKRTVRANAKRSASDENKGTMVGNASKQHKKRTVL
cyclin C-terminal domains are NDVTNIFCENSYSNCLNAAKAQTSRQGRKWSMKKDRDVH
underlined. QSGAVQIMQEDVQAQFVEESSKIKVAESMEITIPDKWAK
RENSEHSISMKDTVAESSRKPQEFICGEKSAALVQPSIV
DIDSKLEDPQACTPYALDIYNYKRSTELERRPSTIYMET
LQKDVTPNMRGILVDWLVEVSEEYKLVPDTLYLTVNLID
RSLSQKFIEKQRLQLLGVTCMLIASKYEEICPPRVEEFC
FITDNTYTSLEVLKMESRVLNLLHFQLSVPTVKTFLRRF
VQAAQVSSEVPSVELEYLANYLAELTLVEYSFLKFLPSL
MAASAVLLARWTLNQSDNPWNLTLEHYTKYKASELKAAV
LALEDLQLNTSGSTLNAIREKYRQQKVNYSLLIHSKANH
EIL
18 The amino acid sequence of SEQ ID MAGSDENNPGVVGGAHVQEGLRVGAGKMGAGNVQQRRAL
278. The conserved cyclin N- and SNINSNIIGAPPYPCAVNKRVLSEKNVNSENDLLNAAHR
C-terminal family domains are PITRQFAAQMAYKQQLRPEENKRTTQSVSNPSKSEDCAI
underlined and the cyclins LDVDDDKMADDFPVPMFVQHTEAMLEEIDRMEEVEMEDV
signature is in bold. AEEPVTDIDSGDKENQLAVVEYIDDLYMFYQKAEASSCV
PPNYMDRQQDINERMRGILIDWLIEVHYKFELMDETLYL
TVNLIDRFLAVQPVVKKKLQLVGVTAMLLACKYEEVSVP
VVEDLILISDRAYSRKEVLEMERLMVNTLHFNMSVPTPY
VFMRRFLKAAQSDKKLELLSFFIIELSLVEYDMLKFPPS
LLAASAIYTALSTITRTKQWSTTCEWHTSYSEEQLLECA
RLMVTFHQRAGSGKLTGVHRKYSTSKFGHAARTEPANFL
LDFRL
19 The amino acid sequence of SEQ ID MASRPIVPVQARGEAAIGGGAGKAAIGGGAGKQQKKNGA
279. The conserved cyclin and AEGRNRKALGDIGNLVTVRGIEGKVQPHRPITRSFCAQL
cyclin C-terminal domains are LANAQAAAAAENNKKQAVVNVNGAPSILDVPGAGKRAEP
underlined. AAAAAAAVAKAAQKKVVKPKQKAEVIDLTSDSEERSRPR
RSNNIMSLRRRKERNHREGICPLSLRSSLLEARLVDWLI
EIHNKFDLMPETLYLTINIIDRFLSVKAVPRRELQLLGM
GALFTASKYEEIWAPEVNDLVCIADRAYSHEQVLAMEKT
ILGKLEWTLTVPTHYVFLVRFIKASLGDRKLENMVYFLA
ELGVMNYATLTYCPSMVAASAVYAARCTLGLTPLWNDTL
KLHTGFSESQLMDCARLLVGYHAKAKENKLQVVYKKYSS
SQREGVALIPPAKALLCEGGGLSSSSSLASSS
20 The amino acid sequence of SEQ ID MGLPDENNAALSKPTNLQVGGLEIGGRKFGQEIRQTRRA
280. The conserved cyclin and LSVINQNLVGDRAYPCHVVNKRGHSKRDAVCGKDQVDPV
cyclin C-terminal domains are HRPLTRKFAAQTASTQQHCIEEAKKPRTAVQERNEFGDC
underlined and the cyclins IFVDVEDCQPSSENQPVPMFLEIPESRLDDDMEEVEMED
signature is in bold. IVEEEEEEPIMDIDGRDKKNPLAVVDYIEDIYANYRRTE
NCSCVSANYMAQQADINEKMRSILIDWLIEVHDKFDLMH
ETLFLTVNLIDRFLARQSVVRKKLQLVGLVAMLLACKYE
EVSVPVVGDLILISDKAYTRKEVLEMESLMLNSLQFNMS
VPTPYVFMRRFLKAAESDKKLEVLSFFLIELSLVEYEMV
KFPPSLLAAAAIFTAQCTLYGFKQWTKTCEWHSNYTEDQ
LLECARMMVGFHQKAATGKLTGVHRKYGTSKFGYTSKCE
PANFLLGEMKNP
21 The amino acid sequence of SEQ ID MGLPDENNAALSKPTNLQVGGLEIGGRKFGQEIRQTRRA
281. The conserved cyclin and LSVINQNLVGDRAYPCHVVNKRGHSKRDAVCGKDQVDPV
cyclin C-terminal domains are HRPLTRKFAAQTASTQQHCIEEAKKPRTAVQERNEFGDC
underlined and the cyclins IFVDVEDCQPSSENQPVPMFLEIPESRLDDDMEEVEMED
signature is in bold. IVEEEEEEPIMDIDGRDKKNPLAVVDYIEDIYANYRRTE
NCSCVSANYMAQQADINEKMRSILIDWLIEVHDKFDLMH
ETLFLTVNLIDRFLARQSVVRKKLQLVGLVAMLLACKYE
EVSVPVVGDLILISDKAYTRKEVLEMEKLMLNSLQFNMS
VPTPYVFMRRFLKAAESDKKLEVLSFFLIELSLVEYEMV
KFPPSLLAAAAIFTAQCTLYGFKQWTKTCEWHSNYTEDQ
LLECARMMVGFHQKAATGKLTGVHRKYGTSKFGYTSKCE
AANFLLGEMKNP
22 The amino acid sequence of SEQ ID MAMVQRQGHDPSSPQEQEDGPSSFLSDDALYCEEGRFEE
282. The conserved cyclin N- and DDGGGGGQVDGIPLFPSQPADRQQDSPWADEDGEEKEEE
C-terminal family domains are EAELQSLFSKERGARPELAKDDGGAVAARREAVEWMLMV
underlined. RGVYGFSALTAVLAVDYLDRFLAGFRLQRDNRPWMTQLV
AVACLALAAKVEETDVPLLVELQEVGDARYVFEAKTVQR
MELLVLSTLGWEMHPVTPLSFVHHVARRLGASPHHGEFT
HWAFLRRCERLLVAAVSDARSLKHLPSVLAAAAMLRVIE
EVEPFRSSEYKAQLLSALHMSQEMVEDCCRFILGIAETA
GDAVTSSLDSFLKRKRRCGHLSPRSPSGVIDASFSCDDE
SNDSWATDPPSDPDDNDDLNPLPKKSRSSSPSSSPSSVP
DKVLDLPFMNRIFEGIVNGSPI
23 The amino acid sequence of SEQ ID MEASYQPHHHGHLRQHDPSSSQQEEQVPFDALYCSEEHW
283. The conserved cyclin and GEEDEEEGLASDGLLSEERDHRLLSPRALLDQDLLWEDE
cyclin C-terminal domains are ELASLFSKEEPGGMRLNLENDPSLADARREAVEWIMRVH
underlined. AHYAFSALTALLAVNYWDRFTCSFALQEDKPWMTQLSAV
ACLSLAAKVEETQVPLLIDFQVEDSSPVFEAKNIQRMEL
LVLSSLEWKMNPVTPLSFLDYMTRRLGLTGHLCWEFLRR
CENVLLSVISDCRFTCYLPSVIAASTMLHVINGLKPRLD
VEDQTQLLGILAMGMDKIDACYKLIDDDHALRSQRYSHN
KRKFGSVPGSPRGVMELCFSSDGSNDSWSVAASVSSSPE
PHSKKSRAGEEAEDRLLRGLEGEEDDPASADIFSFPH
24 The amino acid sequence of SEQ ID MALQEEDTRRHYPTAPPFSPDGLYCEDETFGEDLADNAC
284. The conserved cyclin and EYAGGGARDGLCEIKDPTLPPSLLGQDLFWEDGELASLV
cyclin C-terminal domains are SRETGTHPCWDELISDGSVALARKDAVGWILRVHGHYGF
underlined. RPLTAMLAVNYLDRFFLSRSYQRDRPWISQLVAVACLSV
AAKVEETQVPILLDLQVANAKFVFESRTIQRMELLLMST
LDWRMNSVTPISFFDHILRRFGLTTNLHRQFFWMCERLL
LSVVADVRLASFLPSVVATAAMLYVNKEIEPCICSEFLD
QLLSLLKINEDRVNECYELILELSIDHPEILNYKHKRKR
GSVPSSPSGVIDTSFSCDSSNDSWGVASSVSSSLEPRFK
RSRFQDQQMGLPSVNVSSMGVLNSSY
25 The amino acid sequence of SEQ ID MGQIQYSEKYFDDTYEYRHVVLPPDVAKLLPKNRLLSEN
285. The conserved cyclin- EWRAIGVQQSRGWVHYAIHRPEPHIMLFRRPLNYQQQQE
dependent kinases regulatory NQAQQNMLAK
subunit domain is underlined and
the cyclin-dependent kinases
regulatory subunits signature 1 is
in bold.
26 The amino acid sequence of SEQ ID MGSIDPPKAEQNGTAAAAVADPGQKPGAGDAMPPPPPVK
286. The conserved chromo domain HSNGTAAEPDVATKRRRMSVLPLEVGTRVMCRWRDGKYH
is underlined and the MOZ/SAS-like PVKVIERRKLNPGDPNDYEYYVHYTEFNRRLDEWVKLEQ
protein domain is in bold/italics. LDLNSVETVVDEKVEDKVTGLKMTRHQKRKIDETHVEGH
EELDAASLREHEEFTKVKNIATIELGRYEIETWYFSPFP
PEYNDCSKLYFCEFCLNFMKRKEQLQRHMKKCD
PKVLDRHLKAAGRG
GLEVDVSKLIWTPYREQG
27 The amino acid sequence of SEQ ID MDTGGNSLPSGPDGVKRKVCYFYDPEVGNYYLLQHMQVL
292. The conserved histone KPVPARDRDLCRFHADDYVAFLRSITPETQQDQLRQLKR
deacetylase family domain is FNVGEDCPVFDGLHSFCQTYAGGSVGGAVKLNHGLCDIA
underlined. INWAGGLHHAKKCEASGFCYVNDIVLGILELLKQHERVL
YVDIDIHHGDGVEEAFYTTDRVMTVSFHKFGDYFPGTGD
IRDIGYGKGKYYSLNVPLDDGIDDESYHSLFKPIIGKVM
EVFKPGAVVLQCGADSLSGDRLGCFNLSIKGHAECVRYM
RSFNVPVLLLGGGGYTIRNVARCWCYETGVALGLEVDDK
MPQHEYYEYFGPDYTLHVAPSNMENKNSRQLLEEIRSKL
LENLSKLQHAPSVPFQERPPDTELPEADEDQEDPDERWD
PDSDMDVDEDRKPLPSRVKRELIVEPEVKDQDSQKASID
HGRGLDTTQEDNASIKVSDMNSMITDEQSVKMEQDNVNK
PSEQIFPK
28 The amino acid sequence of SEQ ID MDTGGNSLPSGPDGVKRKVCYFYDPEVGNYYYGQGHPMK
293. The conserved histone PHRIRMTHALLAHYGLLQHMQVLKPVPARDRDLCRFHAD
deacetylase family domain is DYVAFLRSITPETQQDQLRQLKRFNVGEDCPVFDGLHSF
underlined. CQTYAGGSVGGAVKLNHGLCDIAINWAGGLHHAKKCEAS
GFCYVNDIVLGILELLKQHERVLYVDIDIHHGDGVEEAF
YTTDRVMTVSFHKFGDYFPGTGDIRDIGYGKGKYYSLNV
PLDDGIDDESYHSLFKPIIGKVMEVFKPGAVVLQCGADS
LSGDRLGCFNLSIKGHAECVRYMRSFNVPVLLLGGGGYT
IRNVARCWCYETGVALGLEVDDKMPQHEYYEYFGPDYTL
HVAPSNMENKNSRQLLEDIRSKLLENLSKLQHAPSVPFQ
ERPPDTELPEADEDQEDPDERWDPDSDMDVDEDRKPLPS
RVKRELIVEPEVKDQDSQKASIDHGRGLDTTQEDNASIK
VSDMNSMITDEQSVKMEQDNVNKPSEQIFPK
29 The amino acid sequence of SEQ ID MRPKDRISYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVL
294. The conserved histone SYELHTKMEIYRPHKAYPAELAQFHSPDYVEFLHRITPD
deacetylase domain is underlined. TQHLFPNDLAKYNLGEDCPVFENLFEFCQIYAGGTIDAA
RRLNNQLCDIAINWAGGLHHAKKCEASGFCYINDLVLGI
LELLKYHARVLYIDIDVHHGDGVEEAFYFTDRVMTVSFH
KFGDMFFPGTGDVKEIGGKEGKFYAINVPLKDGIDDTSF
TRLFKAIISKVVETYQPGAIVLQCGADSLAGDRLGCFNL
SIDGHSECVRFVKKFNLPLLVTGGGGYTKENVARCWVVE
TGVLLDTELPNEIPENEYFKYFAPDYSLKIPRGNIVLEN
LNSKSYLSAIKVQVLENLRNIQHAPSVQMQEVPPDFYIP
DFDEDEQNPDERMDQHTQDKQIQRDDEYYDGDNDNDHNM
DD
30 The amino acid sequence of SEQ ID MTVAEDFHVNNRSKMVSQATPESRLTGGEDDNSLHNQVD
295. The conserved histone ELLCQELPERQVILEFEGTRPKPYFSDHNGGENSALGVR
deacetylase family domain is ATEDDLNSDVEAEEKQKEMTLEDMYKNDGTLYDDDEDDS
underlined and the Zinc finger DWEPVKRQVELMRWFCTNCTMVNVEDVFLCDICGEHRDS
RanBP2-type profile is in bold. GILRHGFYASPFMQDVGAPSVEAEVQESREDHARSSPPS
SSTVVGFDEKMLLHSEVEMKSHPHPERADRLQAIAASLA
TAGIFPGRCRSLPVREITKEELQMVHSSEHVDAVEMTSH
MFSSYFTPDTYANEHSARAARIAAGLCADLASTIISGRS
KNGFALVRPPGHHAGIKHAMGFCLHNNAAVAALAAQGAG
AKKVLIVDWDVHHGNGTQEIFDGNKSVLYISLHRHEGGN
FYPGTGAAHEVGTMGAEGYCVNIPWSRRGVGDNDYVFAF
HHIVLPIASAFAPDFTIISAGFDAARGDPLGCCDVTPAG
YAQMTHMLSALSGGKLLVILEGGYNLRSISSSAVAVIKV
LLGDSPISEIADAVPSKAGLRTVLEVLKIQRSYWPSLES
IFWELQSQWGMFLVDNRRKQIRKRRRVLVPIWWKWGRKS
VLYHLLNGHLHVKTKR
31 The amino acid sequence of SEQ ID MAAAPSSPPTNRVDVFWHDGMLSHDTGRGVFDTGSDPGF
296. The conserved histone LDVLEKHPENPDRVRNMVSILKRGPISPFISWHTATPAL
deacetylase family domain is ISQLLSFHSPEYINELVEADKNGGKVLCAGTFLNPGSWD
underlined. AALLAAGNTLSAMKYVLDGKGKIAYALVRPPGHHAQPSQ
ADGYCFLNNAGLAVRLALDSGCKRVVVVDIDVHYGNGTA
EGFYQSSDVLTISLHMNHGSWGPSHPQSGSVDELGEDEG
YGYNMNIPLPNGTGDRGYEYAVTELVVPAVESFKPEMVV
LVVGQDSSAFDPNGRQCLTMDGYRAIGRTIRGLADRHSG
GRILIVQEGGYHVTYSAYCLHATVEGILDLPDPLLADPI
AYYPEDEAFPVKVVDSIKRYLVDKVPFLKEH
32 The amino acid sequence of SEQ ID MVESSGGASLPSVGQDARKRRVSYFYEPTIGDYYYGQGH
297. The conserved histone PMKPHRIRMAHNLIVHYYLHRRMEISRPFPAATTDIRRF
deacetylase family domain is HSEDYVTFISSVTPETVSDPAFSRQLKRFNVGEDCPVFD
underlined. GIFGFCQASAGGSMGAAVKLNRGDSDIALNWAGGLHHAK
KSEASGFCYVNDIVLGILELLKVHKRVLYVDIDVHHGDG
VEEAFYTTDRVMTVSFHKFGDFFPGSGHIKDTGAGPGKN
YALNVPLNDGIDDESFRGMFRPIIQKVMEVYQPDAVVLQ
CGADSLSGDRLGCFNLSVKGHADCLRFLRSFNVPLMVLG
GGGYTMRNVARCWCYETAVAVGVEPENDLPYNEYYEYFG
PDYTLHVEPCSMENLNAPKDLERIRNMLLEQLSRIPHAP
SVPFQMTPPITQEPEEAEEDMDERPKPRIWNGEDYESDA
EEDKSQHRSSNADALHDENVEMRDSVGENSGDKTREDRS
PS
33 The amino acid sequence of SEQ ID MAAIISCHHYHSCCSSLIASKWVGARIPTSCFGRSSTQS
299. The conserved cyclophilin- NNAASVRQFVTRCSSSPSSRGQWQPHQNGEKGRSFSLRE
type peptidyl-prolyl cis-trans CAISIALAVGLVTGVPSLDMSTGNAYAASPALPDLSVLI
isomerase family domain is SGPPIKDPEALLRYALPINNKAIREVQKPLEDITDSLKV
underlined. AGLRALDSVERNVRQASRVLKQGKNLIVSGLAESKKDHG
VELLDKLEAGMDELQQIVEDGNRDAVAGKQRELLNYVGG
VEEDMVDGFPYEVPEEYKNMPLLKGRAAVDMKVKVKDNP
NLEECVFRIVLDGYNAPVTAGNFVDLVERHFYDGMEIQR
ADGFVVQTGDPEGPAESFIDPSTEKPRTIPLEIMVDGEK
APVYGATLEELGLYKAQTKLPFNAFGTMAMARDEFEDNS
ASSQIFWLLKESELTPSNANILDGRYAVFGYVTENQDFL
ADLKVGDVIESVQVVSGLDNLANPSYKIAG
34 The amino acid sequence of SEQ ID MAGEDFDIPPADEMNEDFDLPDDDDDAPVMKAGDEKEIG
300. The conserved FKBP-type KQGLKKKLVKEGDAWETPDNGDEVEVHYTGTLLDGTQFD
peptidylprolyl isomerase domains SSRDRGTPFKFTLGQGQ
are underlined. The FKBP-type EAGSPPTIPPNATLQFDVELLSWTSVKDICKD
peptidyl-prolyl cis-trans GGIFKKILVEGEKWENPKDLDEVLVKYEFQLEDGTTIAR
isomerase signature 1 is in bold SDGVEFTVKEGHFCPAVAKAVKTMKKGEKVLLTVKPQYG
and the FKBP-type peptidyl-prolyl FGEKGKPASGDEGAVPPNATLQITLELVSWKTVSEVTDD
cis-trans isomerase signature 2 is KKVIKKILKEGEGYERPNEGAVVEVKLIGKLQDGTVFVK
in bold/italics. KGHDDCEELFKFKIDEEQ
SSESKQDLAVVPPSSTVYYEVELVSFVKDKE
SWDMNTEEKIEAAGKKKEEGNVIFKAGKYAKASKRYEKA
VKYIEYDTSFSEDEKKQAKALKVACNLNDAACKLKLKDY
NQAEKLCTKVLELDSRNVKALYRRAQAYIELSDLDLAEF
DIKKALEIDPHNRDVKLEYKVLKEKVKEFNKKDAKFYGN
MFAKMSKLEPVEKTAAKEPEPMSIDSKA
35 The amino acid sequence of SEQ ID MSTVYVLEPPTKGKVVLNTTHGPLDVELWPKEAPKAVRN
301. The conserved cyclophilin- FVQLCLEGYYDNTIFHRIIKDFLVQGGDPTGSGTGGESI
type peptidyl-prolyl cis-trans YGDAFSDEFHSRLRFKHRGLVACANAGSPHSNGSQFFIT
isomerase family domain is LDRCDWLDRKNTIFGKITGDSIYNLSGLAEVETDKSDRP
underlined and the cyclophilin- LDPPPKIISVEVLWNPFEDIVPRAPVRSLVPTVPDVQNK
type peptidyl-prolyl cis-trans EPKKKAVKKLNLLSFGEEAEEEEKALVVVKQKIKSSHDV
isomerase signature is in bold. LDDPRLLKEHIPSKQVDSYDSKTARDVQSVREALSSKKQ
ELQKESGAEFSNSFREIADDEDDDDDDASFDARMRRQIL
QKRKELGDLPPKPKPKSRDGISARKERETSISRDKDDDD
DDDQPRVEKLSLKKKGIGSEARGERMANADADLQLLNDA
ERGRQLQKQKKHRLRGREDEVLTKLETFKASVFGKPLAS
SAKVGDGDGDLSDWRSVKLKFAPEPGKDRMTRNEDPNDY
VVVDPLLEKGKEKFNRMQAKEKRRGREWAGKSLT
36 The amino acid sequence of SEQ ID MASAISMHSSGLLLLQGTNGKDVTEMGKAPASSRVANMQ
302. The conserved cyclophilin- QRKYGATCCVARGLTSRSHYASSLAFKQFSKTPSIKYDR
type peptidyl-prolyl cis-trans MVEIKAMATDLGLQAKVTNKCFFDVEIGGEPAGRIVIGL
isomerase family domain is FGDDVPKTVENFRALCTGEKGFGYKGCSFHRIIKDFMIQ
underlined and the cyclophilin- GGDFTRGNGTGGKSIYGSTFEDENFALKHVGPGVLSMAN
type peptidyl-prolyl cis-trans AGPSTNGSQFFICTVKTPWLDNRHVVFGQVVDGMDVVQK
isomerase signature is in bold. LESQETSRSDVPRQPCRIVNCGELPLDG
37 The amino acid sequence of SEQ ID MAASFTALSNVGSLSSPRNGSEIRRFRPSCNVAASVRPP
303. The conserved cyclophilin- PLKAGLSASSSSSFSGSLRLIPLSSSPQRKSRPCSVRAS
type peptidyl-prolyl cis-trans AEAAAAQSKVTNKVYLDISIGNPVGKLVGRIVIGLYGDD
isomerase signature is underlined. VPQTAENFRALCTGEKGFGYKGSTVHRVIKDFMIQGGDF
DKGNGTGGKSIYGRTFKDENFKLSHVGPGVVSMANAGPN
TNGSQFFICTVKTPWLDQRHVVFGQVLEGMDIVRLIESQ
ETDRGDRPRKRVVVSDCGELPVV
38 The amino acid sequence of SEQ ID MAEAIDLTGDGGVMKTIVRRAKPDAVSPSETLPLVDVRY
304. The conserved FKBP-type EGVLAETGEVFDSTHEDNTLFSFEIGKGSVISAWDTALR
peptidyl-prolyl cis-trans TMKVGEVAKITCKPEYAYGSTGSPPDIPPDATLIFEVEL
isomerase signature is underlined VACKPCKGFSVTSVTEDKARLEELKKQREIAAATKEEEK
and the FKBP-type peptidyl-prolyl KRREEAKAAAAARVQAKLDAKKGHGKGKGKAK
cis-trans isomerase signature 2 is
in bold.
39 The amino acid sequence of SEQ ID MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRAL
305. The conserved cyclophilin- CTGEKGAGRSGKPLHYKGSSFHRVIPGFMCQGGDFTAGN
type peptidyl-prolyl cis-trans GTGGESIYGSKFADENFVKKHTGPGVLSMANAGPGTNGS
isomerase family domain is QFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSS
underlined and the cyclophilin- GRTSKPVVVADCGQLS
type peptidyl-prolyl cis-trans
isomerase signature is in bold.
40 The amino acid sequence of SEQ ID MPNPKVFFDMTIGGAAAGRVVMELYADTTPRTAENFRAL
306. The conserved cyclophilin- CTGEKGVGRSKKPLHYKGSKFHRVIPSFMCQGGDFTAGN
type peptidyl-prolyl cis-trans GTGGESIYGVKFADENFIKKHTGPGILSMANAGPGTNGS
isomerase signature is underlined QFFICTTKTEWLDGKHVVFGKVVEGMEVVKAIEKVGSSS
and the cyclophilin-type peptidyl- GRTSKPVVVADCGQLP
prolyl cis-trans isomerase
signature is in bold.
41 The amino acid sequence of SEQ ID MAEAIDLTGDGGVMKTIVRRAKPDAVSPSETLPLVDVRY
307. The conserved FKBP-type EGVLAETGEVFDSTHEDNTLFSFEIGKGSVISAWDTALR
peptidyl-prolyl cis-trans TMKVGEVAKITCKPEYAYGSTGSPPDIPPDATLIFEVEL
isomerase signature is underlined VACKPCKGFSVTSVTEDKARLEELKKQREIAAATKEEEK
and the FKBP-type peptidyl-prolyl KRREEAKAAAAARVQAKLDAKKGHGKGKGKAK
cis-trans isomerase signature 2 is
in bold.
42 The amino acid sequence of SEQ ID MATARSFFLCALLLLATLYLAQAKKSEDLKEVTHKVYFD
308. The conserved cyclophilin- VEIAGKPAGRIVMGLYGKAVPKTAENFRALCTGEKGTGK
type peptidyl-prolyl cis-trans SGKPLHYKGSSFHRIIPSFMLQGGDFTLGDGRGGESIYG
isomerase signature is underlined EKFADENFKLKHTGPGLLSMANAGPDTNGSQFFITTVTT
and the cyclophilin-type peptidyl- SWLDGRHVVFGKVLSGMDVVYKVEAEGRQSGTPKSKVVI
prolyl cis-trans isomerase ADSGELPL
signature is in bold.
43 The amino acid sequence of SEQ ID MMRREISVLLQPRFVLAFLALAVLLLVFAFPFSRQRGDQ
309. The conserved cyclophilin- VEEEPEITHRVYLDVDIDGQHLGRIVIGLYGEVVPRTVE
type peptidyl-prolyl cis-trans NFRALCTGEKGKSANGKKLHYKGTPFHRIISGFMIQGGD
isomerase family domain is VIYGDGKGYESIYGGTFADENFRIKHSHAGIISMVNSGP
underlined and the cyclophilin- DSNGSQFFITTVKASWLDGEHVVFGRVIQGMDTVYAIEG
type peptidyl-prolyl cis-trans GAGTYNGKPRKKVIIADSGEIPKSKWDEER
isomerase signature is in bold.
44 The amino acid sequence of SEQ ID MWATAEGGPPEVTLETSMGSFTVELYFKHAPRTSRNFIE
310. The conserved cyclophilin- LSRRGYYDNVKFHRIIKDFIVQGGDPTGTGRGGESIYGK
type peptidyl-prolyl cis-trans KFEDEIKPELKHTGAGILSMANAGPNTNGSQFFITLAPC
isomerase family domain is PSLDGKHTIFGRVCRGMEIIKRLGSVQTDNNDRPIHDVK
underlined and the cyclophilin- ILRTSVKD
type peptidyl-prolyl cis-trans
isomerase signature is in bold.
45 The amino acid sequence of SEQ ID MSNPKVFFDILIGKMKAGRVVMELFADVTPKTAENFRAL
311. The conserved cyclophilin- CTGEKGIGRSGKPLHYKGSTFHRIIPNFMCQGGDFTRGN
type peptidyl-prolyl cis-trans GTGGESIYGMKFADENFKIKHTGLGVLSMANAGPDTNGS
isomerase family domain is QFFICTEKTPWLDGKHVVFGKVIDGYNVVKEMESVGSDS
underlined and the cyclophilin- GSTRETVAIEDCGQLSEN
type peptidyl-prolyl cis-trans
isomerase signature is in bold.
46 The amino acid sequence of SEQ ID MDDDFEFPASSNVENDDDDGMDMDDMGGDVPEEEDPVAS
312. The conserved FKBP-type PAVLKVGEEREIGKAGFKKKLVKEGEGWETPSSGDEVEV
peptidylprolyl isomerase domains HYTGTLLDGTKFDSSRDRGTPFKFKLGRGQ
are underlined. The FKBP-type ESGSPPTIPPNATLQFDVE
peptidyl-prolyl cis-trans LLSWSSVKDICKDGGILKKVLVEGEKWDNPKDLDEVFVK
isomerase signature 1 is in bold YEASLEDGTLISKSDGVEFTVGDGYFCAALAKAVKTMKK
and the FKBP-type peptidyl-prolyl GEKVLLTVMPQYAFGETGRPASGDEAAVPPDASLQIMLE
cis-trans isomerase signature 2 is LVSWKTVSDVTKDKKVLKKTLKEGEGYERPNDGAAVQVR
in bold/italics. The TPR repeat LCGKLQDGTVFVKKDDEEPFEFKIDEEQ
is in italics. PTESQQDLAVVPANSTVYYEV
ELLSFVKEKESWEMNNQEKIEAAARKKEEGNAAFKAGKY
VRASKRYEKAVRFIEYDSSFSDEEKQQAKTLKNTCNLND
AACKLKLKDFKEAEKLCTKVLEGDGKNVKALYRRAQAYI
QLVDLDLAEQDIKKALEIDPNNRDVKLEYKILKEKVREY
NKRDAQFYGNMFAKMNKLEHSRTAGMGAKHEAAPMTIDS
KA
47 The amino acid sequence of SEQ ID MAKPRCFMDISIGGELEGRIVGELYTDVAPKTAENFRAL
313. The conserved cyclophilin- CTGEKGIGPHTGAPLHYKGVRFHRVIKGFMVQGGDISAG
type peptidyl-prolyl cis-trans DGTGGESIYGLKFEDENFDLKHERKGMLSMANSGPNTNG
isomerase family domain is SQFFITTTRTSHLDGKHVVFGRVVKGMGVVRSVEHVTTA
underlined and the cyclophilin- AGDCPTVDVVIADCGEIPAGADDGIRNFFKDGDTYPDWP
type peptidyl-prolyl cis-trans ADLDESPAELSWWMDAVDSIKAFGNGSYKKQDYKMALRK
isomerase signature is in bold. YRKALRYLDICWEKEGIDEVESSSLRKTKSQIFTNSSAC
The TPR repeat is in bold/italics. KLKLCDLKGALLDAEFAVRDGENN
GIKKELNAAKKKIFERREQ
EKRAYRKMFL
48 The amino acid sequence of SEQ ID MTKRKNPLVFLDVSIDGDPVERIVIELFADTVPRTAENF
314. The conserved cyclophilin- RSLCTGEKGVGKTTGKPLHYKGSYFHRIIKGFMAQGGDF
type peptidyl-prolyl cis-trans SNGNGTGGESIYGGKFADENFKLAHDGPGLLSMANGGPN
isomerase signature is underlined TNGSQFFIIFKRQPHLDGKHVVFGKVMRGMEVVKKIEQV
and the cyclophilin-type peptidyl- GSANGKPLQPVKIVDCGETSETGTQDAVVEEKSKSATLK
prolyl cis-trans isomerase AKKKRSARDSSSESRGKRRQRKSRKERTRKRRRYSSSDS
signature is in bold. YSSESSDSDSESYSSDTESESKSHSESSVSDSSSSDGRR
RKRKSTKREKLRRQRGKDSRGEQKSARYDKKSRHKSADS
SSDSESESSSRSRSRDDKKKSSRRESARSVSKLKDAEAN
SPENLESPRDREIKKVEDNSSHEEGEFSPKNDVQHNGHG
TDAKFGKYDDQRPRSDGSKKSSGSMRDSPKRLANSVPQG
SPSSSPAHKASEPSSSIRARNPSRSPAPDGNSKRIRKGR
GFTERFSYARRYRTPSPEDVTYRPYHYGRRNFHDRRNDR
YSNYRSYSERSPHRRYRSPPRGRSPPRYQRRRSRSRSVS
RSPGGNKGRYRGRDQSRSRSRSRSRSPRRGSSPANKQLP
LSERLKSRLGTRVDEHSPRRRRSSSRSHDSSRSRSPDEV
PDKHEGKAAPVSPARSRSSSPSGRGLVSYGDASPDSGIN
49 The amino acid sequence of SEQ ID MSVLLVTSLGDIVVDLHADRCPLTCKNFLKLCRIKYYNG
315. The conserved cyclophilin- CVFHTVQKDFTAQTGDPTGTGTGGDSVYKFLYGDQARFF
type peptidyl-prolyl cis-trans MDEIHLDLKHSKTGTVAMASGGENLNASQFYFTLRDDLD
isomerase signature is underlined. YLDGKHTVFGEVAEGLETLTRINEAYVDEKGRPYKNIRI
The CCHC type zinc finger is in RHTYILDDPFDDPPQLAELIPDASPEGKPKDEVVDDVRL
bold and the RNA-binding region EDDWVPLDEQLGPAQLEEAIRAKEAHSRAVVLESIGDIP
RNP-1 (RNA recognition motif) is DAEIKPPDNV
in bold/italics.
DFSQSVAKLWSQFKRKDSQAAKGKGCFKCGAPDHM
ARECPGSSTRQPLSKYILKEDNAQRGGDDSRYEMVFDED
APESPSHGKKRRGRDDRDDRHKMSRQSVEETKFNDREGG
HSVDKHRQSERSKHREDEMSRDSKASEAGRRRIDRDFPE
EERDGEKYTESHRDRDGKRGDYRDYRKGRADVQTHGDRR
GDENYRRKSAAYDDGHEGAGAARRKDSNDDHHAYRRGYG
DSRKGTRDEDDDGRGRRDDPSYRRSSGHKDSSNGGREEQ
KYRSGETDGKSHPERSHRGDRRR
50 The amino acid sequence of SEQ ID MRPFNGGSSIACLVLVIAAGALAESQGPHLGSARVVFQT
316. The conserved cyclophilin- NYGDIEFGFFPGVAPRTVDHIFKLVRLGCYNTNHFFRVD
type peptidyl-prolyl cis-trans KGFVAQVADVANGRTAPMNDEQRTEAEKTIVGEFSNVKH
isomerase signature is underlined. VRGILSMGRYDDPDSAQSSFSILLGDAPHLDGKYAIFGR
VTKGDETLKKLEQLPTRREGMFVMPTERITILSSYYYDT
GAESCEEENSTLRRRLAASAVEVERQRMKCFP
51 The amino acid sequence of SEQ ID MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
317. The conserved cyclophilin- CTGEKGTGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
type peptidyl-prolyl cis-trans GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
isomerase signature is underlined QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
and the cyclophilin-type peptidyl- GRTSKPVVIADSGQLA
prolyl cis-trans isomerase
signature is in bold.
52 The amino acid sequence of SEQ ID MRFTSITSAIALFAAAASALDKPLDIKVDKAVECSRKTK
318. The conserved FKBP-type AGDKIQVHYRGTLEADGSEFDASYKRGQPLSFHVGKGQV
peptidyl-prolyl cis-trans IKGWDQGLLDMCPGEKRTLTIQPDWGYGSRGMGPIPANS
isomerase signature is underlined VLIFETELVEIAGVAREEL
and the FKBP-type peptidyl-prolyl
cis-trans isomerase signature 2 is
in bold.
53 The amino acid sequence of SEQ ID MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRAL
319. The conserved cyclophilin- CTGEKGAGRSGKPLHYKGSSFHRVIPGFMCQGGDFTAGN
type peptidyl-prolyl cis-trans GTGGESIYGSKFADENFVKKHTGPGVLSMANAGPGTNGS
isomerase signature is underlined QFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSS
and the cyclophilin-type peptidyl- GRTSKPVVVADCGQLS
prolyl cis-trans isomerase
signature 2 is in bold.
54 The amino acid sequence of SEQ ID MAVATRSRWVAMSVAWILVLFGTLALIQNRLSDTGASSD
320. The conserved FKBP-type PKLVHRKVGEEKKKPDDLEEVTHKVFFDVEIGGKPAGRI
peptidyl-prolyl cis-trans VMGLFGKTVPKTVENFRALCTGEKGIGKSGKPLNYKGSQ
isomerase signature is underlined FHRIIPKFMIQGGDFTLGDGRGGESIYGNKFSDENFKLK
and the Cyclophilin-type peptidyl- HTDAGRLSMTNAGPDTNGSQFFITTVTTSWLDGRHVVFG
prolyl cis-trans isomerase KVLSGMDVVHKIEAEGGQSGQPKSIVVISDSGELDL
signature is in bold.
55 The amino acid sequence of SEQ ID MAVTLHTNLGDIKCEIFCDEVPKAAEHNARGILSMANSG
321. The conserved cyclophilin- PNTNGSQFFIAYAKQPHLNGLYTIFGRVIHGFEVLDIME
type peptidyl-prolyl cis-trans KTQTGPGDRPLAEIRLNRVTIHANPLAG
isomerase signature is underlined.
56 The amino acid sequence of SEQ ID MAVATRSRWVAMSVAWILVLFGTLALIQNRLSDTGASSD
322. The conserved FKBP-type PKLVHRKVGEEKKKPDDLEEVTHKVFFDVEIGGKPAGRI
peptidyl-prolyl cis-trans VMGLFGKTVPKTVENFRALCTGEKGIGKSGKPLNYKGSQ
isomerase signature is underlined FHRIIPKFMIQGGDFTLGDGRGGESIYGNKFSDENFKLK
and the Cyclophilin-type peptidyl- HTDAGRLSMANAGPDTNGSQFFITTVTTSWLDGRHVVFG
prolyl cis-trans isomerase KVLSGMDVVHKIEAEGGQSGQPKSIVVISDSGELDL
signature is in bold.
57 The amino acid sequence of SEQ ID MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRAL
323. The conserved cyclophilin- CTGEKGAGRSGKPLHYKGSSFHRVIPGFMCQGGDFTAGN
type peptidyl-prolyl cis-trans GTGGESIYGSKFADENFVKKHTGPGVLSMANAGPGTNGS
isomerase signature is underlined QFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSS
and the cyclophilin-type peptidyl- GRTSKPVVVADCGQLS
prolyl cis-trans isomerase
signature 2 is in bold.
58 The amino acid sequence of SEQ ID MSPVAANAMEEAAEPEVPAPVTPSKDDADTDAAVSRFLG
324. The conserved A-box of the FCKSKLGLAEGNCVQSSTLLRKTAHVLRSSGTVIGTGTA
Retinoblastoma-associated protein EEAERYWFAFVLYTVRRVGERKAEDEQNGSDETEVPLSR
is underlined and the B-box of the ILKASVLNLIDFFKEIPQFVIKAGAIVSGIYGANWDSRL
Retinoblastoma-associated protein EAREMQTNYVHLCILCKFYKRICGEFFILNDAKDDMKSA
is in bold. DSSTSDPVIMYQPFGWLLFLALRIHALSRFKDLVSSTNA
LVSVLAILIIHLPTRFRKFSISDSSQLVKRSEKGVDLVG
SLAYRYDTSEDEIKRTLEKANNVIAEILGITPPPASECK
AENLENVDTDGLIYFGNLMEETSLSSILSTLEKIYEDAT
RNDSEFDERVFINDDDSLLVSGSLSGAAINLTGAKRKYD
SFASPAKTITRPLSPSRSPASHINGIIGGTNLRITATPV
ATAMTTAKWLRTFVSPLPSKPSTDLQGFLASCDRDVTSD
VIRRANIILEAIFPNSPIGERTVTGGLQNANLMDNMWAE
QRRLEALKLYYRVLEAMCRAEAQILHSNNLTSLLTNERF
HRCMLACSAELVLATHKTVTMLFPAVLERTGITAFDLSK
VIESFVRHEETLPRELRRHLNTLEERLLENMVWERGSSM
YNSLVVARPALAPEINRLGLLPEPMPSLDAIALLINFSS
SGLPQSPVQKHEASPGQNGDIRSPKRISTEYRSVLVERN
FTSPVKDRLLALSNIKSKLPPPPLQSAFASPTRPHPGGG
GETCAETAIHIFFSKITKLAAVRINAMLERLQLSQQIKE
GVYCLFQQILSQRTNLFFNRHIDQVILCCFYGVAKINQI
NLTFREIIYNYRKQPQCKPQVFRNVFVDWSTRRNGKAGN
EHVDIISFYNEIFIPSVKPLLVELGPTGATTRTNRTSEV
GNKNDAQCPGSPKISSFPTLPDMSPKKVSASHNVYVSPL
RSSKMDASISHSSKSYYACVGESTHAYQSPSKDLVAINS
RLNGNRKVRGTLNFDDVDAGLVSDSMVANSLYLQNGSSM
SSSTAKSSEK
59 The amino acid sequence of SEQ ID MRPILMKGHERPLTFLKYNREGDLLFSCAKDHTPTVWFA
325. The conserved G-protein beta DNGERLGTYRGHNGAVWCCDVSRDSMRLITGSADTTAKL
WD-40 repeat domains are WSVQNGTQLFTFNFDSPARSVDFSIGDKLAVITTDPFME
underlined. LPSAIHVKRIARDPADQASESVLVLRGHQGRIARAVWGP
LNKTIISAGEDAVIRIWDSETGKLLRESDKETGHKKAVT
SLMKSVDGSHFVTGSQDKSAKLWDIRTLTLIKTYVTERP
VNAVTMSPLLDHVVLGGGQDASAVTMTDHRAGKFEAKFF
DKILQEEIGGVKGHFGPINALAFNPDGKSFSSGGEDGYV
RLHHFDPDYFNIKI
60 The amino acid sequence of SEQ ID MDKKRTVVPLVCHGHSRPVVDLFYSPITPDGFFLISASK
326. The conserved G-protein beta DSSPMLRNGETGDWIGTFEGHKGAVWSCCLDTNALRAAS
domain is underlined and the WD-40 GSADFSAKLWDALSGDELHSFEHKHIVRSCAFSEDTHLL
repeat domains are in bold. LTGGVEKILRIFDLNRPDAPPREVDNSPGSIRTVAWLHS
DQTILSSCTDIGGVRLWDVRSGKIVQTLETKSPVTSSEV
SQDGRYITTADGSTVKFWDANHFGLVKSYNMPCNIESAS
LEPKLGNKFIAGGEDMWVHIFDFHTGEEIGCNKGHHGPV
HCVRFSPGGESYASGSEDGTIRIWQTGPANNVEGDANPS
NGPVTGKAKVGADEVTRKVEDLQIGKEGKDWREG
61 The amino acid sequence of SEQ ID MAEGLILKGTMRAHTDMVTAIAIPIDNSDMVVTSSRDKS
327. The conserved G-protein beta IILWHLTKEEKVYGVPRRRLTGHSHFVQDVVLSSDGQFA
WD-40 repeat domains are LSGSWDGELRLWDLATGVSARRFVGHTKDVLSVAFSIDN
underlined. RQIVSASRDRTIKLWNTLGECKYTIQEGEAHTDWVSCVR
FSPNTLQPTIVSASWDRTIKVWNLTNCKRNTLAGHNGY
VNTVAVSPDGSLCASGGKDGVILLWDLAEGKRLYNLEAG
AIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVEDLRV
DLKNEADKTDGTTTAASNKKVIYCTSLNWSADGSTLFSG
YNDGVIRVWGTGRY
62 The amino acid sequence of SEQ ID MAEGLHLKGTMKAHTDMVTAIAVPIDNADMIVTSSRDKS
328. The conserved G-protein beta IILWHLTKEDKVYGVPRRRLTGHSHFVQDVVLSSDGQFA
WD-40 repeat domains are LSGSWDGELRLWDLATGVSARRFVGHTKDVLSVAFSIDN
underlined. RQIVSASRDRTIKLWNTLGECKYTIQEGEAHNDWVSCVR
FSPNTLQPTIVSASWDRTVKVWNLTNCKLRNTLQGHSGY
VNTVAVSPDGSLCASGGKDGVILLWDLAEGKKLYSLEAG
AIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVEDLRV
DLKNEADMSDGTTGAMSSNKKVIYCTSLNWSADGSTLFS
GYNDGVIRVWGIGRY
63 The amino acid sequence of SEQ ID MAEGLHLKGTMKAHTDMVTAIAVPIDNADMIVTSSRDKS
329. The conserved G-protein beta IILWHLTKEDKVYGVPRRRLTGHSHFVQDVVLSSDGQFA
WD-40 repeat domains are LSGSWDGELRLWDLATGVSARRFVGHTKDVLSVAFSIDN
underlined and the Trp-Asp (WD) RQIVSASRDRTIKLWNTLGECKYTIQEGEAHNDWVSCVR
repeats signature is in bold. FSPNTLQPTIVSASWDRTVKVWNLTNCKLRNTLQGHSGY
VNTVAVSPDGSLCASGGKDGVILLWDLAEGKKLYSLEAG
AIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVEDLRV
DLKNEADMSDGTTGAMSSNKKVIYCTSLNWSADGSTLFS
GYNDGVIRVWGIGRY
64 The amino acid sequence of SEQ ID MSGVPAPPFATTTPENGTMSSNSPAFHRDSDDDDDQGEV
330. The conserved G-protein beta FLDDSDIIHEVAVDDEDLPDADDEADEAEEADDSLHIFT
WD-40 repeat domains are GHNGEVYSLACSPTDATLVATGAGDDKGFLWRIGHGDWA
underlined. VELQGHKDSISSLAFSLDGQLLASGSLDGVIQIWDVPSG
NLKGTLDGPGGGIEWIRWHPKGHIILAGSEDSTVWMWNA
DKMAYLNMFSGHGNSVTCGDFTPDGKTICTGSDDATLRI
WNPKSGENIHVVKGHPYHAEGLTSMAISSDSGLAITGAK
DGSVRIVNISSGRVVSSLDAHADSVEFVGLALSSPWAAT
GSLDQKLIIWDLQHSSPRATCDHEDGVTCLSWVGASRFL
ASGCVDGKVRVWDSLSGDCVRTFHGHSDAIQSLSVSANE
EFLVSVSIDGTARVFEIAEFH
65 The amino acid sequence of SEQ ID MGTSQHQLSSCLQLLPRRRGNKNLIFRRTMASGGAAAVA
331. The conserved G-protein beta PPPGYKPYRHLKTLTGHVAAVSCVKFSNDGTLLASASLD
WD-40 repeat domains are KTLIIWSSAALSLLHRLVGHSEGVSDLAWSSDSHYICSA
underlined. SDDRTLRIWSSRSPFDCLKTLRGHTDFVFCVNFNPQSSL
IVSGSFDETIRIWEVKTGRCLNVIRAHSMPVTSVHFNRD
GSLIVSGSHDGSCKIWDTKNGACLKTLIDDTVPAVSFAK
FSPNGKFILVATLNDTLKLWNYATGKFLKIYTGHKNSVY
CLTSTFSVTNGKYIVSGSEDRCICIWDLQGKNLIQKLEG
HSDTVISVTCHPSENKIASAGLDSDRTVRIWLQDA
66 The amino acid sequence of SEQ ID MPSQKIETGHQDIVHDVAMDYYGKRVATASSDTTIKIIG
332. The conserved G-protein beta VSNSSGSQHLASLSGHKGPVWQVAWAHPKFGSILASCSY
WD-40 repeat domains are DGQVILWKEGNQNDWAQAHVFNDHKSSVNSIAWAPHELG
underlined. LCLACGSSDGNISVFTARPDGGWDTTRIEQAHPVGVTSV
SWAPSMAPGALVGSGLLDPVQKLASGGCDNTVKVWKLYN
GTWKMDCFPALQMHSDWVRDVAWAPNLGLPKSTIASASQ
DGTVVIWTVAKEGEQWQGKVLKDFKTPVWRVSWSLTGNL
LAVADGNNNVTLWNEAVDGEWQQVTTVEP
67 The amino acid sequence of SEQ ID MKIAGLKSVENAHDESVWAAAWVPATESRPALLLTGSLD
333. The conserved G-protein beta ETVKLWRPDELALERTNAGHFLGVVSVAAHPSGVIAASA
WD-40 repeat domains are SIDSFVRVFDVDTNATIATLEAPPSEVWQMQFDPKGTTL
underlined and the Trp-Asp (WD) AVAGGGSASIKLWDTATWELNATLSIPRPEQPKPSEKGN
repeats signature is in bold. KKFVLSVAWSPDGRRLACGSMDGTISIFDVARAKFLHHL
EGHFMPVRSLVFSPVEPRLLFSASDDAHVHMYDSEGKSL
VGSMSGHASWVLSVDVSPDGAALATGSSDRTVRLWDLSM
RAAVQTMSNHSDQVWGVAFRPMAGAGVRAGGRLASVSDD
KSISLYDYS
68 The amino acid sequence of SEQ ID MEIDLGNLAFDVDFHPSEQLVASGLITGDLLLYRYGDGS
334. The conserved G-protein beta SPEKLLEVRAHGESCRAVRFINDGKAILTGSPDCSILAT
WD-40 repeat domains are DVETGSVVARVENAHEAAVNRLVNLTESTIATGDDNGCI
underlined and the Trp-Asp (WD) KVWDTRQRSCCNTFSAHEDFISDMTFASDSMKLVVTSGD
repeats signature is in bold. GTLSVCNLRSNKVQTRSEFSEDELLSVVIMKNGRKVVCG
TQSGTLLLYSWGFFKDCSDRFVDLSPSSVDALLKLDEDR
IIAGTENGLISLIGILPNRIIQPIAEHSDHPIERLAFSH
DKKFLGSISHDQTLKLWDLNDILGSEDSPSSQAAIDDSD
SDEMDVDANPPDSSKGNKKKHSGKGNDVGNANNFFADLGD
69 The amino acid sequence of SEQ ID MSQQPSVILATASYDHTIRFWEAKSGRCYRTIQYPDSQV
335. The conserved G-protein beta NRLEITPHKRYLAVAGNPSIRLFDVNSNTPQPVMSFDSH
WD-40 repeat domains are TNNVMAVGFQYDGNWMYSGSEDGTVRIWDLRARGCQREY
underlined and the Trp-Asp (WD) ESRGAVNTVVLHPNQTELISGDQNGNIRVWDLTANSCSC
repeats signature is in bold. ELVPEVDTAVRSLTVMWDGSLVVAANNNGTCYVWRLLRG
SQTMTNFEPLHKLQAHNGYILKCLLSPEFCEPHRYLATA
SSDHTVKIWNVEGFTLEKTLIGHQRWVWDCVFSVDGAYL
ITASSDTTARLWSMSTSQDIRVYQGHHKATTCCALHDGA
EGSPG
70 The amino acid sequence of SEQ ID MEDAMDMEVEVEVEAEEHSPSSSNPSGSSFRRFGLKNSI
336. The conserved G-protein beta QTNFGSDYVFEITPKFDWSLMGVSLSSNAVKLYSPTTGQ
WD-40 repeat domains are YCGECRGHSDTVNGISFSGPSSPHVLHSCSSDGTIRAWD
underlined. TRSFKEVSCISAGPSQEIFSFSFGGSSDSLLSAGCKSQI
LFWDWRNKKQVACLEDSHVDDVTQVCFVPHHQNKLISAS
VDGLICIFDTAGDINDDEHMESVINVGTSIGKVGIFGQT
FEKLWCLTHIETLSVWDWKEGTNEANFEDARKLASDSWS
LDHIDYFVDCHSAEEGEGLWVIGGTNAGTLGYFPVKYKG
GAAIGSPEAVLGGGHSDVVRSVLPMSGMAGTTSKTRGIF
GWTGGEDGRLCCWLSDDSSATSRSWMSSNLVLKSSRSHH
KKNRHQPY
71 The amino acid sequence of SEQ ID MSQHQEYPMEYAADDYDVGEVEDDMYFHERVMGDSDTDE
337. The conserved G-protein beta DEEYDHLDNKITDTSAADARRGKDIQGIPWERLSVTREK
domain is underlined and the WD-40 YRRTRIEQYKNYENVPQSGESSEKDCKPTRKGGNYYEFW
repeat domains are in bold. RNTRSVKSTILHFQLRNLVWSTTKHDVYLMSHFSIIHWS
SLTCKKTEVLDVYGHVAPREKHPGSLLEGFTQTQVSTLA
VRDKLLIAGGFQGELICKNLDRPGVSYCCRTTYDDNAIT
NAVEIYDYPSGAVHFMASNNDCGVRDFDMEKFELSRHFT
FPWPVNHTSLSPDGKLLVIVGDNPEGIVVDSQRGKTIRP
LQGHLDFSFASAWHPDGHIFATGNQDKTCRIWDIRNLSK
SVAVLKGNLGAIRSIRFTSDGRFMAMAEPADFVHVYDVK
SGYEKEQEIDFFGEISGVSFSPDTESLFVGVWDRTYGSL
LQYNRCRNYSYLDSM
72 The amino acid sequence of SEQ ID MGASSDPNPDVSDEHQKRSEIYTYEAPWHIYAMNWSVRR
338. The conserved G-protein beta DKKYRLAIASLLDHPAAAAAVPNRVEIVQLDDSTGEIRA
WD-40 repeat domains are DPNLSFDHPYPATKAAFVPDKDCQRADLLATSSDFLRIW
underlined. RIADDSSRVDLRSFLNGNKNSEFCRPLTSFDWNEAEPKR
IGTSSIDTTCTIWDIERETVDTQLIAHDKEVYDIAWGGV
SVFASVSADGSVRVFDLRDKEHSTIIYESSEPDTPLVRL
GWNKQDPRYMATIIMDSAKVVVLDIRYPTMPVVELQRHQ
ASVNAIAWAPHSSCHICTAGDDSQALIWDLSSMAQPVEG
GLDPILAYTAGAEIEQLQWSSSQPDWVAIAFSLKLQ
73 The amino acid sequence of SEQ ID MRGGGGGGDATGWDEDAYRESVLKEREVQTRTVFRAAFA
339. The conserved G-protein beta PSPSPSPSPDAVVVASSDGSVASYSISACLSDHRLQSLR
WD-40 repeat domains are FADAKSQNVLEAEPACFLQGHDGPAYDVKFYGEGEDSLL
underlined. LSCGDDGRIRGWMWRDITSSEAHDHSQGNSAKPVLDLVN
PQSRGPWGALSPIPENNALAVDVKRGSIYAAAGDSCAYC
WDVECGKIKTVFKGHSDYLHCIAARNSSSQIITGSEDGT
ARIWDCRSGKCVQVIDPDKDHKKGFFASVSCLALDASES
WLVCGRGRDLSVWSISASDCIAKISTNAPAQDVLFDDNQ
ILLVGAEPLISRLDMNGAVLSQIHCAPQSVFSVSLHQSG
VTAVGGYGGLVDVISQFGSHLCTFRCKCI
74 The amino acid sequence of SEQ ID MEAPIIDPLQGDFPEVIEEYLEHGIMKCIAFNRRGTLLA
340. The conserved G-protein beta AGCTDGSCIIWDFETRGVAKELRDKECTAAITSVCWSKY
WD-40 repeat domains are GHRILVSASDKSLILWDVLSGEKIAHTTLQHTVLQACLH
underlined. PGSSTPSICLACPFSSAPMIVDLNTGSTTALPVLTADVS
NGATPLSRNKTSDTSVTYSPCNACFNKHGDLVYAGTSKG
EILIIDHKNVRVCAIVLVSGGAVIKNVVFSRNGQYMLTN
SNDRLIRIYKNLLPPKDGLKMLDELNESFNESDDVEKLK
AIGSKCLELLHEFQDSITRVQWKAPCFSGDGEWVIGGAA
SRGEHKIYIWDRAGHLVKILEGPKEALMDLAWHPVHPII
ISVSLTGLVYIWAKDYTENWSAFAPDFKELEENEEYVER
EDEFDLVPETEKVKGLDVHEDDEVDVLTVERDSVFSDSD
MSQEELCFLPAVPCLDIPEQQDKCVGSCSKLPDGNHSGS
PLSVEAGQNGNASNHNSSPLEPMENSTADDTDGVRLKRK
RKPSEKGLELQAEKVKKPVKPLKSSGRLSKTNKPVIDPD
SSNGVYGDDGSD
75 The amino acid sequence of SEQ ID MRGVSWPEDGNNPSTSSSSQRNQQQAHAPRAVSGHAASH
341. The conserved G-protein beta PSASNIFKLLVQREVSPRSKHSSKKLWREASKCQPYPFQ
WD-40 repeat domains are QSCEAVRDVRQGLISWVESASLRHLSAKYCPLVPPPRST
underlined. IAAAFSPDGKILASTHGDHTVKLIDSQTGSCLKVLRGHR
RTPWVVRFHPLYPEILASGSLDHEVRLWDANTAECIGSR
NFYRPIASIAFHARGELLAVASGHKLYIWHYNRRGETSS
PTIVLRTQRSLRAVHFHPHAAPFLLTAEVNDLDSADSAM
TLATSPGYLHYPPPTVYFADAHSHERSRLADELPLMPLP
LLMWPSFTRDDGRVPLQRIDGDVGLNGQQRVDSSSSVRL
WTYSTPSGQYELLLSPVESGNSPSMPEETGNNAFSSAVE
AEVSQSAMDTVEDMEVQPEERNTQFFSFSDPRFWELPLL
HGWLVGQTQAGPRSVRQSSPGDIETQSAFGEVASVSPIT
SGVMPVSMDPSRFGGRSGSRYRSPGSRGVHVTGPNNDGP
RDENDPQSVVSKLRSELAASLAAAASTELPCTVKLRIWP
HDVKDPCAQLDLESCRLTIPHAVLCSEMGAHFSPCGRFL
AACVACVLPHLESDPGLHGQVNQDVTGVATSPTRHPISA
HQIMYELRIYSLEEATFGIVLASRPVRAAHCLTSIQFSP
TSEHLLLAYGRRHSSLLKSIVIDGENTVPIYTILEVYRV
SDMELVRVLPSAEDEVNVACFHPSVGGGLIYGTKEGKLR
ILHYDSSHGLNLKSSGFLDENVPEVQTYALEC
76 The amino acid sequence of SEQ ID MDSAVAIAALSLVVGAAIALLFFGNYFRKRRSEVVAMAE
342. The conserved G-protein beta ADLQPHPKNPSRPPPQPAAKKVHAKSHAHGADKDKNKRH
WD-40 repeat domains are HPLDLNTLKGHGDSVTGLCFASDGRSLATACADGVVRVF
underlined and the Trp-Asp (WD) KLDDASNKSFKFLRINLPAGGHPTAVAFGDGVSSVIVAS
repeats signature is in bold. QHLSGCSLYMYGEEKPTNLDSNKQQTKLPMPEIKWEHHK
VHEQKAILTLSGAAANYDSGDGSTIIASCSEGTDIIIWH
AKTGKILGNVDTNQLKNTMSAISPNGRFIAAAAFTADVK
VWEIVYSKDGSVKGVTKVMQLKGHKSAVTWLCFTPNSEQ
IVTASKDGSIRIWNINVRYHLDEDTKTLKVFPIPLQDSS
GTTLHYERLSLSPDGKILAATHGSMLQWLCIETGKVLDT
AEKAHDGDITCMSWAPQSIPTGDKKVNVLATASGDKKVK
LWAAPPLPS
77 The amino acid sequence of SEQ ID MEVEPKKASKTFPVKPKLKPKPRTPSGKTPESKYWSSFK
343. The conserved G-protein beta TTHPLDNLSFSVPSLAFSPSPPHLLAAAHSATVSLFSPH
WD-40 repeat domains are RTTISSFSDVVSSLSFRSDGQLLAASDLSGLIQVFDVRS
underlined. RTPLRRLRSHARPVRFVRYPVLDKLHLVSGGDDALVKYW
DVAGESVVSELRGHKDYVRCGDCSPADANCFVTGSYDHV
VKLWDVRVRDGNRAATEVNHGSPVQDVIFLPSGSLVATA
GGNSVKIWDLIGGGRMVYSMESHNKTVTSICVGTMGAQQ
SGEEGVQLRILSVGLDGYMKVFDYSRMKVTHSMRFPAPL
LSIGFSPDSNVRAIGTSNGILYVGKRKAKENAEGGANGI
LGLGSVEEPRRRVLKPSFYRYFHRGQSEKPSEGDYLVMR
PKKVKLAEHDKLLKKFQHKNALISVLGGNDPEKVVAVME
ELVARRALLKCVLNLDADELGLILTFLHKNSTVPRYSSL
LLGLAKKVIDLRLEDIRASDALKGHIRNLKRSVDEEIRI
QEGLQEIQGMVSPLLRIAGRR
78 The amino acid sequence of SEQ ID MQGGSSGVGYGLKYQARCISDVKADTDHTSFLTGTLSLK
344. The conserved G-protein beta EENEVHLLRLSSGGTELICEGLFSHPSEIWDLSSCPFDQ
WD-40 repeat domains are RIFSTVFSTGESYGAAVWQIPELYGQLNSPQLEKIASLD
underlined. AHSRKISCVLWWPSGRHDKLVSIDEENIFLWGLDCSKKS
AQVQSQESAGMLHNLSGGAWDPHDVNTVAATCESSIQFW
DLRTMKKANSLESVHARDLDYDMRKKHLLVTSEDESGVR
VWDLRMPKAPIQEFPGHTHWTWAVRCNPDYEGLILSAGT
DSAVNLWWSSTASSDELISERLIDSPTRKLDPLLHSYND
YEDSVYGLAWSSREPWIFASLSYDGRVVVESVKPFLSRK
79 The amino acid sequence of SEQ ID MAEEEGSAELEQQLEEEFAVWKKNTPILYDLLISHALEW
345. The conserved G-protein beta PSLTVHWAPLLPQPSSSAAAAAGDPSLAAHRLVLGTHTS
WD-40 repeat domains are DGAPNFLILADALLPSSESDHCGDDAVLPKVEISQKIRV
underlined. DGEVNRARFMPQNHNIVGAKTNGCEVYVFDCSKQAAKQH
DGGFDPDLRLTGHDGEGYGLSWSPLKENYLLSASHDKKI
CLWDISAAAQDKVLGAMHVFEAHEGAVGDASWHSKNDNL
FGSAGDDCQLMIWDLRTNKAQQCVKAHEKEVNSVSFNSY
NDWILATASSDTTVGLFDMRKLTTPLHVFSSHEGEVLQV
EWDPNHEAVLASSSEDRRVMVWDLNRIGDEQQEGDASDG
PAELLFSHGGHKAKISDFSWNKNEPWVISSVAEDNSVQV
WQMAESICGDDDDMQAMEGYI
80 The amino acid sequence of SEQ ID MGNYGEEDEDQYFDALEETASVSDRGSNSSDCCSSGSGL
346. The conserved G-protein beta DENVLDSLGFEFWTKFPESVRARRNRFLMLTGLGIEANS
WD-40 repeat domains are VDKEDAFPPSCNEIEVYTCKVTRDDGAVQRSLDSYNCIS
underlined. LLQSSTSIRSNQEVESLRGDSLLSSFRGRSKESDDLTEL
CGMGCPESKRNAVSEFGSVSQGSIEELRRIVASSPLVHP
LLHRKLEYERELIETKQKMGAGWLRKFGSATCISGRQGD
TWSDPDDLEITAGMKMRRVRAHSSKKKYKELSSLYAAQE
FLAHEGSISTMKFSMDGQYLASAGEDTVVRVWKVTEEDR
SERVNVTVDPSCLYFALNESTQLASLNTNKEHIGKAKTF
QRSSDSSCVILPLKVFQITEKPWHEFKGHNGEVLDLSWS
SKGYLLSSSTDKTVRLWRVGCDRCQRVYSHNDYVTCISF
NPVNENFFISGSIDGKVRIWNVFGGQVVAYIDCREIVSA
VCYRSDGKGAIVGTMTGNCLFYSIKDNHLQMDAQVYLHG
KKKSPGKRITGFQFPPNDPGKLMITSADSVIRVLSGLDV
VCKLKGPRNSGGPMIATFTSDGKHVISASEDSNVYIWNY
AGQDKTSSRVKKIWSCESFWSSNASVALPWCGIRTVPEA
LAPPSRSEERRASCAENGENHHMLEEYFQKMPPYSPDCF
SLSRGFFLELLPKGSATWPEEKLSDTSPPTVSSQAISKL
EYKFLKSACHSVLSSAHMWGLVIVTAGWDGRIRTYHNYG
LPVRS
81 The amino acid sequence of SEQ ID MDIDFKEYRLRCELRGHEDDVRGVCVCGDGSIGTSSRDR
347. The conserved G-protein beta TVRLWAPSAGERRKYEVARVLLGHKSFVGPLAWVPPSEE
WD-40 repeat domains are LPEGGIVSGGMDTLVMAWDLRNGEAQTLKGHQLQVTGIV
underlined. LDGGDIVSASVDCTLIRWKNGQLTEHWEAHKAPIQAVIR
LPSGELVTGSSDTTLKLWRGKTCTQTFVGHTDTVRGLAV
MPDLGILSASHDGSIRLWAVSGECLMEMVDHTSIVYSVD
SHASGLIVSGSEDRFAKIWKDGVCFQSIEHPGCVWDVKF
LEDGDIVTACSDGTIRIWTNQEDRMANSTELELFDLELS
SYKRSRKRVGGLKLEELPGLEALQVPGTSDGQTKVIREG
DNGVAYAWNSTELKWDKIGEVVDGPEDSMNRPALDGVQY
DYVFDVDIGDGEPTRKLPYNRSDNPYDTADKWLLKENLP
LSYRQQIVEFILANSGQRDFNLDPSFRDPYTGSSAYVPG
APSQLAAKQARPTFKHIPKKGMLVFDAAQFDGILKKINE
FNNTLLSNQEKKNLSLTDIEISRLGAVVKILKDTSHYHS
SKFADADFDLMLKLLESWPYEMMFPVIDIFRMVILHPDG
ADGLLRHQEDKKDVLMESIKRATGNPSVPANFLTSIRAV
TNLFKNSAYYSWLQKHRSEMLDAFSSCSSSSNKNLQLSY
ATLLLNYAVLLIEKKDEEGQSQVLSAALELAENESLEVD
ARYRALVAIGSLMLDGLVKRIALDFDVEHIAKAARTSKE
AKIAEVGADIELLIKQS
82 The amino acid sequence of SEQ ID MEFTEAYKQSGPCCFSPNARFIAVAVDYRLVIRDTLSLK
348. The conserved G-protein beta VVQLFSCLDKISYIEWALDSEYILCGLYKRPMIQAWSLI
domain is underlined and the WD-40 QPEWTCKIDEGPAGIAYARWSPDSRHILTTSDFQLRLTV
repeat domains are in bold. WSLVNTACVHVQWPKHASKGVSFTRDGKFAAICTRHDCK
DYINLLSCHNWEIMGVFAVDTLDLADIQWSPDDSAIVIW
DSPLEYKVLVYSPDGRCLFKYQAYESGLGVKSVSWSPCG
QFLAVGSYDQMLRVLSHLTWKTFAEFTHLSNVRAPCCAA
IFKEVDEPLQIDMSELSLSDDYMQGNSGDAPEGHYRVRY
DVTEVPITLPCQKPPADRPNPKQGIGLMSWSNDSQYICT
RNDSMPTILWIWDMRHLELAAILVQKDPIRAAVWDPTGT
RLVLCTGSSHLYMWTPSGAYCVSVPLSQFNITDLKWNSD
GSCLLLKDKESFCCAAAPLPPDESSDYSSDD
83 The amino acid sequence of SEQ ID MATIAALDDDMVRSMSIGAVFSDFVGKLNSLDFHRKDDI
349. The conserved G-protein beta LVTAGEDDSVRLYDIANARLLKTTFHKKHGTDRVCFTHH
WD-40 repeat domains are PNSLICSSTKNLDTGESLRYISMYDNRSLRYFKGHKQRV
underlined. VSLCMSPINDSFMSGSLDHSVRMWDLRVNACQGILRLRG
RPTVAYDQQGLVFAVAMEGGAIKLFDSRSYDKGPFDAFL
VGGDTSEVCDIKFSNDGKSVLLSTTNNNIYVLDAYAGDK
QCGFNLEPSPSTPIEASFSPDGQYVVSGSGDGTLHAWNI
SRRNEVACWNSHIGVASCLKWAPRRAMFVAASTVLTFWI
PNSEPELASAKGEAGVPPEQV
84 The amino acid sequence of SEQ ID MSVAELKERHRAATETVNSLRERLKQKRVQLLDTDVAGY
350. The conserved G-protein beta ARTQGKTPVTFGATDLVCCRTLQGHTGKVYSLDWTPERN
WD-40 repeat domains are RIVSVSQDGRFIVWNALTSQKTHAIRLPCAWVMTCAFAP
underlined and the beta G-protein NGQSVACGGLDSVCSIFNLNSPVDRDGNLPVSRMLSGHK
(transducin) is in bold. GYVSSCQYVPDGDAHLITGSGDQTCVLWDITTGLRTSVF
GGEFQSGHTADVLSVSINGSSPRIFVSGSCDSTARMWDT
RVASRAVHTYHGHEGDVNAVKFFPDGNRFGTGSDDGTCR
LFDIRTGHELQVYYQQRGIDEIPHVTSIAFSISGRLLIA
GYSNGDCFVWDTLLAQVVLNLGSLQNSHEGRISCLGVSA
DGSALCTGSWDTNLKIWAFGGIRRVT
85 The amino acid sequence of SEQ ID MKKRPRGASLDQAVVDIRRREVGGLSGLSFARRLAASEG
351. The conserved G-protein beta LVLRLDIYNKLKGHRGCVNTVGFNLDGDIVISGSDDRHV
domain is underlined and the WD-40 KLWDWQTGKVKLSFDSGHLSNVFQAKIMPYTDDRSIVTC
repeat domains are in bold. AADGQARHAQILEGGQVQTMLLAKHRGRAHKLAIDPGSP
HIVYTCGEDGLVQRLDLRSNTARELFTCREVYGTHVEVV
HLNAIAIDPRNPNLFVIGGSDEYARVYDIRNYKWNGSHN
FGRSANYFCPSHLIGEAHVGITGLAFSGQSELLVSYNDE
SIYLFTQEMGLGPDPLSASTKSVDSNSSEVTSPTAVNVD
DNVTPQVYKGHRNCETVKGVGFFGPKCEYVVSGSDCGRI
FIWKKKGGQLIRVMAADKHVVNCIEPHPHIPALASSGIE
NDIKIWTPKAIERATLPMNVEQLKPKARGWMNRISSPRQ
LLLQLYSLERWPEHGGETSSGLAASQEELTELFFALSAN
GNGSPDGGGDPSGPLL
86 The amino acid sequence of SEQ ID MSKRGYKLQEFVAHSSNVNCLSIGKKACRLFLTGGDDCK
352. The conserved G-protein beta VNLWAIGKPNSLMSLCGHTNAVESVAFDSAEVLVLAGAS
WD-40 repeat domains are SGVIKLWDVEEAKLVRGLTGHRSNCTAMEFHPFGEFFAS
underlined and the Trp-Asp (WD) GSTDTNLKIWDIRKKGCIHTYKGHTRGISTIRFSPDGRW
repeats signature is in bold. VVSGGNDNVVKVWDLTAGKLLHDFKFHENHIRSIDFHPL
EFLLATGSADRTVKFWDLETFELIGSSRPEAAGVRAIAF
HPDGRTLFCGLEDSLKVYSWEPVICHDGVDMGWSTLADL
CIHDGKLLGCSYYQSSVGVWVADASLIEPYGTNVKPQQK
DSGDDEIEHQESRPSAKVGTTIRSTSIMRCASPDYETKD
IKNIYVDTASGNPVSSQRVGTTNFAKVTQPLDFNDTPNL
TLRRQGLVTETPDGLSGHVPSKSITQPKVVSRDSPDGKD
SSRRESITFSRTKPGMLLRPAHSRRPSSTKYDVDRLSAC
AEIGVLSSAKSGSESLVDSFLNIKVAPEDGARNGCEDNH
SSVKNVSVESEKVLPLQTPKTEKCDQTVGFKEEINSVKF
VNGVAVVPGRTRTLVEKFEKREKLNSTEDQTINTPENPT
LDKTPPPSLAENEEKSDRLNIVERKATRMSSHMVTAEDR
TPVTLVGSPEDQSTVMAPQRELPADESSKTPPLPVEDLE
IHHGSNVSEDKATILSSQTVSEEDSKRSTLIRNFRRRDR
FKSTEGRSPVMATQRKLPTDESGKTSSLPMEDLEIKGGL
NVSEDKATSFSSRAPPREDRAHSALVRNVRKRDKFKSTN
DTITVMVHQRGLSTDEASTVSVERVERRQLSNNVENPLN
NLPPHSVPPTTTRGEPQYVGSESDSVNHEDVTELLLGNH
EVFLSTLRSRLTKLQVV
87 The amino acid sequence of SEQ ID MSTFLTGTALSNPNPNKSYEVVQPPNDSVSSLSFNPKAN
353. The conserved G-protein beta FLVATSWDNQVRCWEIVRSGTSLGTTPKASISHDQPVLC
WD-40 repeat domains are STWKDDGTTVFSGGCDKQVKMWPLSGGQPMTVAMHDAPI
underlined. KEISWIPEMNLLVTGSWDKTLRYWDTRQANPVHIQQLPE
RCYALTVRHPLMVVGTADRNLIIYNLQSPQTEFKRISSP
LKYQTRCLAAFPDQQGFLVGSIEGRVGVHHLDDSQQSKN
FTFKCHREGSEIYSVNSLNFHPVHHTFATAGSDGAFNFW
DKDSKQRLKAMSRCSQPIPCSTFNNDGSIFAYSACYDWS
KGAENHNPATAKTYIFLHLPQESEVKGKPRLGTTGRK
88 The amino acid sequence of SEQ ID MEVEAQQRDVNNVMCQLVDPEGTTLGPPMYLPQDVGPQQ
354. The conserved G-protein beta LQQMVNKLLSNEDKLPYTFYISDQELVVPLESYLQKNKV
WD-40 repeat domains are SVEKVLSIVYQPQAIFRIRPVNRCSATIAGHSEAVLSVA
underlined and the Trp-Asp (WD) FSPDGKQLASGSGDTTVRLWDLSTQTPMFTCKGHKNWVL
repeats signatures are in bold. SIAWSPDGKHLVSGSKAGEIQCWDPLTGQPSGNPLVGHK
KWITGISWEPVHLSSPCRRFVSSSKDGDARIWDVTLRRC
VICLSGHTLAVTCVKWGGDGVIYTGSQDCTIKVWETSQG
KLIRELKGHGHWVNSLALSTEYVLRTGAFDHTGKQYSSA
EEMKQVALERYKKMKGNAPERLVSGSDDFTMFLWEPSVS
KHPKTRMTGHQQLVNHVYFSPDGQWVASASFDKSVKLWN
GITGKFVAAFRGHVGPVYQISWSADSRLLLSGSKDSTLK
IWDIRTKKLKRDLPGHADEVFAVDWSPDGEKVVSGGKDK
VLKLWMG
89 The amino acid sequence of SEQ ID MDAGSAHSSSNMKTQSRSPLQEQFLQRRNSRENLDRFIP
355. The conserved G-protein beta NRSAMDFDYAHYMLTEGRKGKENPAVSSPSREAYRKQLA
WD-40 repeat domains are ETLNMNRTRILAFKNKPPTPVELIPHELTSAQPAKPTKT
underlined. RRYIPQTSERTLDAPDLLDDYYLNLLDWGSSNVLSIALG
NTVYLWNASDGSTSELVTIDDETGPVTSVSWAPDGRHIA
VGLNNSDVQLWDSADNRLLRTLRGGHRSRVGSLAWNNHI
LTTGGMDGLIVNNDVRVRSHIVDTYRGHTQEVCGLKWSA
SGQQLASGGNDNILHIWDRSTASSNSPTQWLHRLEEHTA
AVKALAWCPFQGNLLASGGGGGDRTIKFWNTHTGACLNS
VDTGSQVCALLWNKNERELLSSHGFTQNQLTLWKYPSMV
KIAELTGHTSRVLFMAQSPDGCTVASAAGDETLRFWNVF
GVPEVAKPAPKANPEPFAHLNRIR
90 The amino acid sequence of SEQ ID MEEAIPFKNLPSREYQGHKKKVHSVAWNCTGTKLASGSV
356. The conserved G-protein beta DQTARVWHIEPHGHGKVKDIELKGHTDSVDQLCWDPKHA
WD-40 repeat domains are DLIATASGDKTVRLWDARSGKCSQQAELSGENINITYKP
underlined and the Trp-Asp (WD) DGTHVAVGNRDDELTILDVRKFKPIHKRKFNYEVNEIAW
repeats signature is in bold. NMSGEMFFLTTGNGTVEVLAYPSLRPVDTLMAHTAGCYC
IAIDPVGRYFAVGSADSLVSLWDISEMLCVRTFTKLEWP
VRTISFNHTGDYVASASEDLFIDISNVQTGRTVHQIPCR
AAMNSVEWNPKYNLLAYAGDDKNKYQADEGVFRIFGFESA
91 The amino acid sequence of SEQ ID MGKDEEEMRGEIEERLINEEYKVWKKNTPFLYDLVITHA
357. The conserved G-protein beta LEWPSLTVEWLPDREEPPGKDYSVQKLVLGTHTSENEPN
WD-40 repeat domains are YLMLAQVQLPLEDAENDARHYDDDRADVGGFGCANGKVQ
underlined. IIQQINHDGEVNRARYMPQNSFIIATKTVSAEVYVFDYS
KHPSKPPLDGACSPDLRLRGHSTEGYGLSWSKFKQGHLL
SGSDDAQICLWDINATPKNKSLDAMQIFKVHEGVVEDVA
WHLRHEYLFGSVGDDQYLLIWDLRTPSVTKPVQSVVAHQ
SEVNCLAFNPFNEWVVATGSTDKTVKLFDLRKISTALHT
FDAHKEEVFQVGWNPKNETILASCCLGRRLMVWDLSRID
EEQTPEDAEDGPPELLFIHGGHTSKISDFSWNTCEDWVV
ASVAEDNILQIWQMAENIYHDEDDVPGEESNKGS
92 The amino acid sequence of SEQ ID MMRGFSCTEDGDAPSTSSTSPPPPPPPPHRQQMQAPRAS
358. The conserved G-protein beta SSSSGQPTSRRSTGNVFKLLARREVSPRSKHSLKKFWGE
WD-40 repeat domains are ASECQLCPFQQSYEAVRDVRRSLISWVEAFSLQHLSAKY
underlined. CPLMPPPRSTIAAAFSPDGKILASTHGDHTVKLIDSQTG
SCLKVLRGHRRTPWVVRFHPLYPEILASGSLDHEVHLWD
ANTAECIGSRNFYRPIASIAFHAQGDLLAVASGHKLYIW
HYNRSGETSSPTIVLRTPRSLRAVHFHPHAAPFLLTAEV
NDLDLTDSAMTLATSPGYLHYPPPTIYLADAHSNERSRL
EDELPLMPSPLLMWPSFTRDDGRATLPHIGGDVGLSGQQ
RVDSLSSGQYEFHPSPIEPSSSTSMHEEMGTDPFSSVRE
SEVTQSAMNIVDNTEVQPEERSTYSFSFSDPRFWELPSV
YGWLVGQTQAAPRTAPSPGALETASALGEVASVSPVRSE
FMPGGMDQPRLGGRSGSGCRSSGSRMMRTAGLNDHPHDE
NYPQSVVSKLRSELEASLAAAASTELPCTVKLRVWPYDM
KDPCALFRSESCRLTIPHAVLCSEMGAHFSPCGRFFAAC
VACVLPQLEADPVLHGQVDPDVTGVATSPTRHPVSAYQI
MYELRIYSLEEATFGMVLASRSIRAAHCLTSIQFSPTSE
HLLLAYGRRHNSLLKSIVIDGENTVPIYSILEVYRVSDM
ELVRVLPSAEDEVNVACFHPSVGGGLVYGTKEGKLRILQ
IDSSGGLNPKSTGFLDENMAEVPTYALEC
93 The amino acid sequence of SEQ ID MGEGDLPRTEAGVLRGHEGAVLAARFNGDGNYCLSCGKD
359. The conserved G-protein beta RTIRLWNPHRGIHIKTYKSHGREVRDVHCTSDNSKLISC
WD-40 repeat domains are GGDRQIFYWDVSTGRVIRRFRGHDSEVNAVKFNDYASVV
underlined. VSAGYDRSVRAWDCRSHSTEPIQIINTFQDSVMSVCLTK
TEIIGGSVDGTVRTFDIRIGREISDDLGQPVNCISMSND
GNCILASCLDSTLRLVDRSAGELLQEYKGHTCKSYKLDC
CLTNTDAHVAGGSEDGYVFFWDLVDASVISKFRAHSSVV
TSVSYHPKEDCMITASVDGTIKVWKT
94 The amino acid sequence of SEQ ID MACIKGVGRSASVAMAPDGGYLATGTMAGTVDLSFSSSA
360. The conserved G-protein beta SLEIFGLDFQSDDRDLPLIAESPSSERFNRLSWGKNGSG
WD-40 repeat domains are SDEFSLGLIAGGLVDGTIGLWNPLSLIRSEAGDKAIVGH
underlined. LSRHKGPVRGLEFNVIAPNLLASGADDGEICIWDLAAPR
EPSHFPPLRGSGSAAQGEISFLSWNSKVQHILASTSYNG
TTVVWDLKKQKPVISFSDSVRRRCSVLQWNPDLATQLVV
ASDEDSSPTLRLWDMRNIMSPVKEFAGHTRGVIAMSWCP
NDSSYLVTCAKDNRTICWDTVTGEIVCELPAGSNWNFDV
HWYPKIPGVISASSFDGKIGIYNVEGCSRYGVRENEFGA
ATLRAPKWFKRPVGASFGFGGKVVSFHTRSTGGPSVNSS
EVFVHDIITEQTLVSRSSEFEAAIQSGDRPSLRALCEKK
SQHCESTDDQETWGFLKVLLEDDGTARSKLLAHLGFDIP
TETNDGSQEDLSQQVNALGLEDVTADKVVQEDNNESMVF
PTDNGEDFFNNLPSPRADTPVSTSADGFPTVNAAVEPSQ
DEVDGLEESSDPSFDDSVQRALVVGDYKAAVALCMSANK
LADALVIAHVGGASLWESTRDKYLKMSRLPYLKVVFAMV
NNDLQSLVDTRPLKFWKETLAILCSFAQGEEWAMLCNSL
ASKLMAAGNMLAATLCFICAGNIDKTVEIWSRSLATEHD
GMSYMDLLQDLMEKTIVLALASGQKQFSASVCKLVEKYA
EILASQGLLTTAMDYLKLLGTDDLSPELAVLRDRIAFSV
EAEKGANISAFNGSQDPRGAVYGVDQSNYGMVDTSQHYY
PEAAQPQVPHTVPGSPYGENYQQPFGSSFGKGYNTPMQY
QAPSQASMFVPSEPPQNAQPSFVPTPVTSQPTTRSQFIP
APPLALRNPEQYQQPTLGSHLYPGSVNPTFQPLPHAPGP
VAPVPPQVSSVPGQNMPQAVAPTQMRGFMPVTNPGVVQN
PGPISMQPATPIESAAAQPVVSPAAPPPTVQTADTSNVP
APQKPVIATL
95 The amino acid sequence of SEQ ID MKERGKGAGRSVDERYTQWKSLVPVLYDWLANHNLVWPS
361. The conserved G-protein beta LSCRWGPQLEQATYKNRQRLYLSEQTDGSVPNTLVIANV
WD-40 repeat domains are EVVKPRVAAAEHISQFNEEARSPFVKKFKTIIHPGEVNR
underlined. IRELPQNSKIVATHTDSPDVLIWDVETQPNRHAVLGAST
SRPDLILTGHKDNAEFALAMSPTEPFVLSGGKDRYVVLW
SIQDHISTLAADPGSAKSPGSAGTNNKQSSKAAGGNDKT
GDSPSIEPRGVYLGHGDTVEDVTFCPSSAQEFCSVGDDS
CLILWDARTGSSPAIKVEKAHHADLHCVDWNPHDVNLIL
TGSADNTVRMFDRRNLTSGGVGSPVHTFEGHNAAVLCVQ
WSPDKSSVFGSSAEDGILNIWDHEKIGRKIETVGSKVPN
SPPGLFFRHAGHRDKVVDFHWNSSDPWTIVSVSDDGEST
GGGGTLQIWRMIDLIYRPEEEVLAELDKFKSHILSCTS
96 The amino acid sequence of SEQ ID MAKIAPGCEPVAGTLTPSKKREYRVTNRLQEGKRPLYAV
362. The conserved G-protein beta VFNFIDSRYFNVFATVGGNRVTVYQCLEGGVIAVLQSYI
WD-40 repeat domains are DEDKDESFYTVSWACNIDRTPFVVAGGINGIIRVIDAGN
underlined and the Trp-Asp (WD) EKIHRSFVGHGDSINEIRTQPLNPSLIVSASKDESVRLW
repeats signature is in bold. NVHTGICILIFAGAGGHRNEVLSVDFHPSDKYRIASCGM
DNTVKIWSMKEFWTYVEKSFTWTDLPSKFPTKYVQFPVF
IAPVHSNYVDCNRWLGDFVLSKSVDNEIVLWEPKMKEQS
PGEGSVDILQKYPVPECDIWFIKFSCDFHYHSIAIGNRE
GKIYVWELQSSPPVLIAKLSHPQSKSPIRQTAMSFDGST
ILSCCEDGTIWRWDAITASTS
97 The amino acid sequence of SEQ ID MNTAMHFGAGWRSIAEMGYTMSRLEIEPESCEDEKSLDG
363. The conserved G-protein beta VGNSQGPNELPRCLDHELAHLTNLKSRPHEHLIRDFPGR
WD-40 repeat domains are RALPVSTVKMLAGRECNYSRRGRFSSADCCHMLSRYVPV
underlined. NGPSPLDQMNSRAYVSQFSADGSLFVAGFQGSHIRIYNV
DKGWKCQKNILTKSLRWTITDTSLSPDQRYLVYASMSPI
VHIVDIGSAAMDSLANITEIHEGLDFSADSGPYSFGIFS
VKFSTDGREVVAGSSDDSIYVYDLVANKLSLRIPAHESD
VNTVCFADESGHIIYSGSDDTYCKVWDRRCLSARNKPAG
VLMGHLEGITFIDSRGDGRYFISNGKDQTIKLWDIRKMG
SDICRRGFRNFEWDYRWMDYPPRARDSKHPFDLSVATYK
GHSVLRTLIRCYFSPVHSTGQKYIYTGSHDSCVYIYDVV
TGAQVAALKHHKSPVRDCSWHPEYPMIVSSSWDGDIVKW
EFFGNGETEIPAMKKRIRRRHLY
98 The amino acid sequence of SEQ ID MEPQPQAPKKRGRKPKPKEDKKEEQLHQPPPPPPPQQQA
364. The conserved G-protein beta APAPAPAATRSSTSGSAGGRDRRPQQQHAVDEKYARWKS
WD-40 repeat domains are LVPVLYDWLANHNLLWPSLSCRWGPQLEQATYKNRQRLY
underlined. ISEQTDGSVPNTLVIANCEVVKPRVAAAEHVSQFNEEAR
SPFIRKYKTIIHPGEVNRVRELPQNPNIVATHTDSPDVL
IWDVESQPNRHAVYGATASRPNLILTGHQENAEFALAMC
PAEPFVLSGGKDKTVVLWSIQDHITASATDQTTNKSPGS
GGSIIKKTGEGNEETGNGPSVGPRGIYCGHEDTVEDVAF
CPSTAQEFCSVGDDSCLILWDARVGTNPVAKVEKAHNGD
LHCVDWNPHDNNLILTGSADNSVNMFDRRNLTSNGVGSP
VYKFEGHKAAVLCVQWSPDKPSVFGSSAEDGLLNIWDYE
RVDKKVDRAPNAPAGLFFQHAGHRDKIVDFHWNAADPWT
MVSVSDDCDTAGGGGTLQIWRMSDLIYRPEEEVLAELEN
FKAHVLECSKA
99 The amino acid sequence of SEQ ID MGIFEPYRAVGYITTGVPFSVQRLGTETFVTVSVGKAFQ
365. The conserved G-protein beta VYNCAKLSLVLVGPQLPKKIRALASYREYTFAAYGSDIG
WD-40 repeat domains are IFKRAHQLATWSGHTAKVCLLLLFGEHILSVDVDGNAYI
underlined and the Trp-Asp (WD) WAFKGMNYNLSPVGHILLDSNFTPSCIMHPDTYLNKVIL
repeats signature is in bold. The GSQEGPLQLWNISTKTKLYEFKGWNSSVSSCVSSPALDV
Utp21 specific WD40 associated VAVGCADGKIHVHNIRYDEELVTFSHSMRGSVTALSFST
putative domain is in italics. DGQPLLASGSSSGVVSIWNLDKRRLQSVIRDAHDGSIIS
LHFFANEPVLMSSSADNSIKMWIFDTSDGDPRLLRFRSG
HSAPPLCIRFYANGRHILSAGQDRAFRLFSVVQDQQSRE
LSQRHVSKRAKKLKLKEEEIKLKPVIAFDVAEIRERDWC
NVVTSHMDTPQAYVWRLQNFVIGEHILRPCPNKPTPVKA
CMISACGNFAILGTAGGWIERFNLQSGISRGSYIDQLEG
TNSAHDGEVVGVACDATNTLMISAGYAGDIKVWDFKGRE
LKSRWEIGSSLVKISYHRLNGLLATVADDFIIRLFDAVA
LRMVRKFEGHTDRITDLCFSEDGKWLLSSSMDGSLRIWD
IILARQVDAVFVDVSITALSLSPNMDILATTHVDQNGVF
LWVNQSMFSGDSDINLYASGKEVVTVKLPSVSSVEGSQV
EESNEPTIRHSESKDVPSFRPSLEQIPDLVTLSLLPKSQ
WQSLINLDIIKVRNKPVEPPKKPEKAPFFLPSIPSLSGE
ILFKPSEMSDKGDMKADEDKSKITPEVPSSRFLQLLHSC
SEAKNFSPFTTYIKGLSPSTLDLELRMLQIIDDDAVDAD
ADDPQDVDKRQELLSIELLMDYFIHEISCRSNFEFVQAL
VRLFLKIHGETIRRQSVLQNKAKVLLETQCSVWQRVDKL
FQGARCMVAFLSNSQF
100 The amino acid sequence of SEQ ID MEETKVTCGSWIRRPENVNLAVLGRSPRRRGSAALEIFA
366. The conserved G-protein beta FDPKSTSLSSSPLVAHVIEEIEGDPLAIAVHPNGEDIVC
WD-40 repeat domains are FASSGSCLSFELSGQESNLKLLTKELPPLRGIGPQKCMA
underlined. FSVDGSRFATGGVDGRLRILEWPSLRIILDEPKAHKSIR
DLDFSLDSEFLATTSTDGSARIWKAEDGLPCTTLTRRSD
EKIELCRFSKDGTKPFLFCTVQRGDKAVTGVWDISTWNK
IGHKRLLRKPAVVMSISLDGKYLAQGSKDGDMCVVEVKK
MEVSHWSKRLHLGTSLTSLEFCPIERVVITTSDEWGVLV
TKLNVPADWKAWQVYLLLLGLFLASLVAFYIFYENSDSF
WGFPLGKDQPARPKIGSVLGDPKSADDQNMWGEFGPLDM
101 The amino acid sequence of SEQ ID MADPVEHQHQQHQQHQLQQQRRRGWRIQGGQYLGEISAL
367. The conserved G-protein beta CFLHLPPPPLSLSSSPVLSLSSGLDSESRDRPACSFRFP
WD-40 repeat domains are SAGSGSQVSLFDLASGAMVRTFYVFRGIRVHGIVLGCAD
underlined. FPGGSSSSSSTLDYVIAVYGERRVKLFRLSVRLGRGAGE
GSGTVLSADLELVSAAPRLSHWVMDVRFLKENGTSEDEL
QRCLTVAIGCSDNSIRLWDVDKCSFVLAVSSPERCLLYS
MRLWGDNLEDLQVASGTIYNEILIWKVVPNHDAPSSNEL
TEEGLTNSCAGNSVHECLRYEAYHICRLVGHEGSIFRIA
WSSDGSKLVSVSDDRSARIWEVHCKVQYSEDAGEVGLLF
GHSARVWDCYISDNLIVTAGEDCSCRVWGLDGQQHDVIK
EHIGRGIWRCLYDPWSSLLVTGGFDSAIKVHKLDASLAE
ASAKQSNIKDLSDGTELFTTHLPNSSGHSGHMDSKSEYV
RCLSFSCEDVMYIATNHGYLYHAKLCNDGDLRWTELAQV
SNEVQIICMELLPSNPYDPRIDADDWVAVGDGKGWTTVV
RVVKNSDSPKVSTSFSWAAEMDRQLLGIHWCKSLGHRFI
FTADPRGALKLWRFFEVSQSSSLYPENSPRISLIAEFKS
DLGARIMCLDVAFESELLICGDLRGNLVLFPLLKDLLLD
TFVVSAAKISPVNHFKGAHGISAVSSISVAHMSFNHIEL
RSTGADGCICYMEYDKGLQSLNFVGMKQVKELSMIESVS
TENESTGYRTSGSYASGFASTDFIIWNLVTEAKVLQVSC
GGWRRPHSYYLGDVPEMKNCFAYVKDDIIYIRRHWIKDS
KDKILPQNLRLQFHGREVHSLCFVTGDFQLRKNKQSSWI
VTGCEDGTVRLTRYTQCTDNWSSSKLLGEHVGGSAVRSI
CCVSNIHTTSSGTSVSDVKGIENLPKDIKGTLMEDECNP
SLLISVGAKRVLTSWLLRRRKQDGKEDDVTDLQEAENSS
LPSSAGSSTFSFQWLSTDMPVKYSVPSKKSGSIKKLIGV
SDTNVRCKSL
102 The amino acid sequence of SEQ ID MPYKLSATLSNHSSDVRAVASPSDDLILSASRDSTAISW
368. The conserved G-protein beta FRQSPSSFTPASVIRAGSRFVNAIAYLPPTPRAPQGYAV
WD-40 repeat domains are VGGQDTVVNVFALGPGDKEEPEYTLVGHTDNVCALSVNS
underlined. DDTIISGSWDKTAKVWKDFALVYDLKGHQQSVWAVLAMN
EKEFLTASADRTIKYWVQHKTMQTYEGHRDAVRGLALIP
DIGFASCSNDSEIRVWTMGGDVVYTLSGHTSFVYSLSVL
PNGDLVSAGEDRSVRVWRDGECSQVIVHPAISVWAVSTM
PNGDIISGSSDGVVRVFSESEKRWATASELKALEDQIAS
QSLPSQQVGDVKKTDLPGPEALSVPGKKAGEVKMIRSGD
VVEAHQWDSLASSWQKIGEVVDAIGSGRKQLHDGKEYDY
VFDVDIQEGAPPLKLPYNVSENPYTAAQRFLEQNDLPTG
YLDQVVKFIEQNTAGVKLGNDGYVDPFTGASRYQPATQS
TSNTASSSYMDPFTGGSRHIAESAPSNVPQGSHATGIIP
FSKPIFFKLANVSAMQAKMFQFDEVLRNEISTATLAMRP
DEVIMVNETFTYLSKVVTSTSSARTSLGWIHIETIMQIL
DRWPVPQRFPVIDLGRLVTAYCMNAFSGPGDLEKFFSCL
FRTSEWTSITSGSKALTKAQETNVLLLFRTIANSLDGAP
LNDMEWIKQIFRELAQTPQLVLNKSHRLALASVLFNFSC
IGLKGPVPADVRTLHLTIILQVLRSPNDDPEVAYRTCVA
LGNMLYSDKTRGTPRDAQSPSPTELKSAVAAIKGGFSDP
RINDVHREIMSLI
103 The amino acid sequence of SEQ ID MPPQKIESGHKDTVHDLAMDYYGKRLATASSDHTINVVG
369. The conserved G-protein beta VSSSGSQHLATLIGHQGPVWQISWAHPKFGSLLASCSYD
domain is underlined and the WD-40 GRVIIWREGNPNEWTQAQVFEEHKSSVNSVAWAPHELGL
repeat domains are in bold. CLACGSSDGNISVFTARQDGGWDTSRIDQAHPVGVTSVS
WAPSTAPGALVGSGMMEPVQKLCSGGCDNTVKVWKLYNR
VWKLDCFPVLQMHTDWVRDVAWAPNLGLPKSTIASASQD
GRVIIWTLAKEGDQWQGKVLYDFRTPVWRVSWSLTGNIL
AVADGNNNVSLWNEAVDGEWIQVSTVEP
104 The amino acid sequence of SEQ ID MSAPMLEIEARDVVKIVLQFCKENSLHQTFQTLQSECQV
370. The conserved G-protein beta SLNTVDSIETFVADINSGRWDAILPQVAQLKLPRNTLED
WD-40 repeat domains are LYEQIVLEMIELRELDTARAILRQTQAMGVMKQEQPERY
underlined and the Trp-Asp (WD) LRLEHLLVRTYFDPNEAYQDSTKEKRRAQIAQALAAEVT
repeats signature is in bold. VVPPSRLMALVGQALKWQQHQGLLPPGTQFDLFRGTAAM
KQDVDDMYPTTLSHTIKFGTKSHAECARFSPDGQFLVSC
SVDGFIEVWDYMSGKLKKDLQYQADETFMMHDDPVLCVD
FSRDSEMLASGSQDGKIKVWRIRTGQCLRRLERAHSQGV
TSVLFSRDGSQLLSTSFDGSARIHGLKSGKQLKEFRGHS
SYVNDAIFSNDGSRVITASSDCTVKVWDVKTSDCLQTFK
PPPPLRGGDASVNSVHLFPKNADHIVVCNKTSSIYIMTL
QGQVVKSLSSGKREGGDFVAACVSPKGEWIYCVGEDRNL
YCFSCQSGKLEHLMKVHEKDVIGVTHHPHRNLVATYSED
STMKLWKP
105 The amino acid sequence of SEQ ID MDLLQSYAEDNDGDLGRHSSPEPSPPRLLPSKSAAPKVD
371. The conserved G-protein beta DTTLALTVAQTNQTLARPIDPSQHAVAFNPTYDQLWAPI
WD-40 repeat domains are CGPAHPYAKDGIAQGMRNHKLGFVEDAAIGSFLFDEQYN
underlined. TFQRYGYAADPCASTGNEYVGDLDALKQNDGISVYNIRQ
QEQKKYAEEYAKKKGEERGEGGREKAEVVSDKSTFHGKE
ERDYQGRSWIAPPKDAKATNDHCYIPKRLVHTWSGHTKG
VSAIRFFPKHGHLILSAGMDTKVKIWDVFNSGKCMRTYM
GHSKAVRDISFCNDGTKFLTAGYDKNIKYWDTETGKVIS
TFSTGKIPYVVKLHPDDEKQNILLAGMSDKKIVQWDMNT
GQITQEYDQHLGAVNTITFVDDNRRFVTSSDDKSLRVWE
FGIPVVIKYISEPHMHSMPSISLHPNTNWLAAQSLDNQI
LIYSTRERFQLNKKKRFAGHIVAGYACQVNFSPDGRFVM
SGDGEGRCWFWDWKSCKVFRTLKCHEGVCIGCEWHPLEQ
SKVATCGWDGLIKYWD
106 The amino acid sequence of SEQ ID MESNGNLEQTLQDGRIYRQLNSLIVAHLRDHNFPQAASA
372. The conserved G-protein beta VALATMTPLNVEAPRNRLLELVAKGLAVEKGELLRGVSH
WD-40 repeat domains are AGTNDLGGSIPASYGLVPAPWTAIDFSSLRDTKGMSKSF
underlined. TKHETRHLSDHKNVARCARFSTDGRFFATGSADTSIKLF
EVSKIKQMMLPDSTDGAIRAVIRTFYDHTHPVNDLDFHP
QNTVLISAAKDHTVKFFDYSKATAKRAFRVIQDTHNVRS
VAFHPSGDFLLAGTDHPIPHLYDVNTFQCYLSANVPEFA
VNAAINQVRYSSSGGMYVTASKDGTIRFWDGASANCVRS
IAGAHGAAEVTSANFTKDQRYVLSCGKDSTVKLWEVGTG
RLVKQYLGATHMQLRCQAVFNNTEEFVLSIDEPSNEIVV
WDAMTAEKVARWPSNHNGPPRWIEHSPTEAAFVSCSTDR
SIRFWKETH
107 The amino acid sequence of SEQ ID MSNFQGEDGEYVADDFEAEDGDEELHGRESADPESDVDE
373. The conserved G-protein beta IDTPSNRFTDTTADQARRGRDIQGIPWERLSITREKYRR
WD-40 repeat domains are TRLEQYKNYENVPQSGEKSGKDCTVTEKGNSFYEFRRNS
underlined. RSVKSTILHFQLRNLVWATSKHDVYLMSNYSVVHWSSLT
GKKSEVLNLAGHVAPNEKHPGSLLEGFTQTQVSTLAVKD
RFLVAGGFQGELICKFLDRPGISFCSRTTYDDNAITNAV
EIYVSPSGGIHFIASNNDCGVRDFDMENFELSKHFRFPW
PVNHTSLSPDGKLLVIVGDDPEGILVDAKTGKTIMPLRG
HLDFSFASEWHPDGVTFATGNQDKTCRIWDIRNLSKSIA
VLKGNLGAIRSIRYTSDGRYMAIAEPADFVHVYDTKTGY
KKEQEIDFFGEISGMSFSPDTESLFIGVWDRTYGSLLEY
GRRRNFSYLDCLV
108 The amino acid sequence of SEQ ID MGVEEDLEDLNALAESTDAAVDGQAALASAVDSVTLQPA
374. The conserved G-protein beta PPILPPVIPPPAVPVVAPVPTIPPVLRPLAPLPIRPPVL
WD-40 repeat domains are RPPAPKRDEAGSSDSDSDHDGTAAGSTAEYEITEESRLV
underlined and the splicing factor RERHEKAMQDLMMKRRGAALAVPTNDKAVRARLRRLGEP
motif is in bold. MTLFGEREMERRDRLRMLMAKLDAEGQLEKLMKAHEDEE
AAASAAPEDVEEEMLQYPFYTEGSKALFNARIDIAKFSI
TRAALRLERARRRRDDPDEDVDAEIDWALKKAESLSLHC
SEIGDDRPLSGCSFSHDGKLLATCSMSGVAKLWDTCRMP
QVNRVLTLKGHTERATDVAFSPVQNHIATASADRTAKLW
NTEGTILKTFEGHLDRLGRIAFHPSGKYLGTTSFDKTWR
LWDIESGEELLLQEGHSRSIYGIDFHRDGSLVASCGLDA
LARVWDLRTGRSILALEGHVKPVLGVSFSPNGYHLATGG
EDNTCRIWDLRKKKSLYTIPAHANLISEVKFEPQEGYFL
VTASYDTTAKVWSARDFKPVKTLSVHEAKITSVDITADA
SHIVTVSHDRTIKLWTSNDDVKEQAMDVD
109 The amino acid sequence of SEQ ID MVKAYLRYEPAAAFGVIASVESNIAYDASGKHLLAPALE
375. The conserved G-protein beta KVGVWHVRQGVCTKALAPSASSAAGPSLAVTAIASSPSS
WD-40 repeat domains are LIASGYADGSIRIWDFEKGSCETTLNGHKGAVSVLRYGK
underlined, and the conserved LGSLLASGSKDNDIILWDVVGETGLYRLRGHRDQVTDLV
Dip2/Utp12 domain is in bold. FLDSDKKLVSSSKDKYLRVWDLETQHCMQIVGGHHSEIW
SLDTDPEERYLVTGSADPELRFYTVKNDSSDERSEADAS
GGVGNGDLASHNKWDVLKQFGEIQRQSKDRVATVRFNKN
GNLLACQAAGKLVEVFRVLDEAEAKRKAKRRLHRKREKK
GADVNENSDSSRGIGEGHDTMVTVADVFKLLQTIRASKK
ICSISFCPVAPKSSLATLALSLNNNLLEFHSIEADKTSK
MLTIELQGHRSDVRSVTLSSDNTLLMSTSHNSVKIWNPS
TGSCLRTIDSGYGLCGLIVPQNKHALIGTKDGAIEIFDV
GSGTCIEVVEAHGGSIRSIVAIPNQNGFVTGSADHDIKF
WEYGMKQKPGDNSKHLTVSNVRTLKMNDDVLVVAVSPDA
QKIAVALLDCTVKVFFMDSLKLMHSLYGHRLPVLCLDIS
SDGDLIVTGSADKNLMIWGLDFGDRHKSIFAHGDSIMAV
QFVGNTHYMFSVGKDRLVKYWDADKFELLLTLEGHHADI
WCLAISNRGDFLVTGSHDRSIRRWDRTEEPFFIEEEKEK
RLEEMFESDLDNAFGNKYVPKEEIPEEGAVALAGKKTQE
TLSATDSIIEALDIAEVELKRIAEHEEEKNNGKTAEFHP
NYVMLGLSPSDFILRALSNVQTNDLEQTLLALPFSDALK
LLSYLKDWTTYPDKVELVSRIATVLLQTHYNQLVSTPAA
RPLLTTLKDILHKKVKECKDTIGFNLAAMDHLKQLMALR
SDALFQDAKVKLLEIRSQLSKRLEERTDPREAKRRKKKQ
KKSTNMHAWP
110 The amino acid sequence of SEQ ID MGGVQAEREDKDKVSLELTEEILQSMEVGMTFRDYSGRI
376. The conserved G-protein beta SSMDFHRASSYLVTASDDESIRLYDVASATCLKTINSKK
WD-40 repeat domains are YSVDLVSFTSHPMTVIYSSKNGWDESLRLLSLHDNKYLR
underlined. YFKGHHDRVVSLSLCPRNECFISGSLDRTVLLWDQRAEK
CQGLLRVQGRPATAYDDPGLVFAIAFGGCVRMFDARKYE
KGPFEIFSVGGDVSDANVVKFSNDGRLMLLTTTDGHIHV
LDSFRGTLLYTFNVKPTSSKSTLEASFSPEGMFVISGSG
DGSVYAWSVRGGKEVASWLSTDTEPPVIKWAPGNLMFAT
GSSELSFWIPDLSKLGAYVGRK
111 The amino acid sequence of SEQ ID MAAFGAAPAGNHNPNKSSEVIQPPSDSVSSLCFSPRANH
377. The conserved G-protein beta LVATSWDNQVRCWELTKNGASVTSVPKASMSHDQPVLCS
WD-40 repeat domains are AWKDDGTTVFSGGCDKQAKMWSLMSGGQPVTVAMHDAPI
underlined. KEIAWIPEMNVLVTGSWDKTLKYWDTRQSNPVHTQQLPE
RCYAMTVRYPLMVVGTADRNLIVFNLQNPQAEFKRFSSP
LKYQTRCVAAFPDQQGFLVGSIEGRVGVHHLDDSQISKN
FTFKCHRDNNDIYSVNSLNFHPVHHTFATAGSDGTFNFW
DKDSKQRLKAMSRCSQPIPCSTFNNDGTIYAYSVCYDWS
KGAENHNPATAKTYIFLHLPQESEVKAKPRVGTTNRK
112 The amino acid sequence of SEQ ID MNCSISGEVPEEPVVSTKSGHVFERRLIERYVSDYGKCP
378. The conserved G-protein beta VSGEPLTMDDVLPVKMGKIVKPRPLQAASIPGLLSIFQN
WD-40 repeat domains are EWDSLMLSNFALEQQLHTARQELSHALYQHDAACRVIAR
underlined. LKKERDEARSLLALAERQIPMTASSDIAVNAPAMSNGRK
ASLDEEPGYAGKKMRPGISASIIAEITDCNLALSQQRKK
RQIPSTLAPVEDLERYTQLSSYPLHKTGKPGITSLDICH
SKDIIATGGIDTSAVLFDRSSGQIMSTLSGHSKKVTSVN
FDAQGDMVLTGSADKTVRIWQGSEDGSYNCRHILKDHTA
EVQAITVHATNNYFATASLDNTWCFYEFSTGLCLTQVEG
ASGSEGYTSAAFHPDGLILGTGTSNADVKIWDVKTQANV
TTFSGHTGAITAISFSENGYFLATAAQDGVKLWDLRKLK
NFRTFSAYDKDTGTNSVEFDHSGCYLGLAGSDIRVYQVA
SVKSEWNCVKTFPDLSGTGKVTCVKFGPDSKYIAVGSMD
HNLRIFGLPSEDGAMES
113 The amino acid sequence of SEQ ID MAAPGVETLKKEIKELKEKIAQHRLDTDGEQPLPAAAKS
379. The conserved G-protein beta KSVPEVSAALKQRRILKGHFGKIYALHWSADSRHLVSAS
domain is underlined and the WD-40 QDGKLIIWNGFTTNKVHAIPLRSSWVMTCAYSPSGNLVA
repeat domains are in bold. CGGLDNLCSVYKVPHGGNKESSSAQKTYGELAQHEGYLS
CCRFIKDNEIVTSSGDSTCILWDVETKTPKAIFNDHTGD
VMSLAVFDDKGVFVSGSCDATAKLWDHRVHKQCVMTFQG
HESDINSVQFFPDGDAFGTGSDDSSCRLFDIRAYQQINK
YSSDKILCGITSVAFSKTGKSLFAGYDDYNTYVWDTLSG
NQVEVLTGHENRVSCLGVSEDGKALATGSWDTLLKIWA
114 The amino acid sequence of SEQ ID MGGVEDESEPASKRMKLSSRVLRGLANGSSRTEPAAGSS
380. The conserved G-protein beta LDLMARPLPIEGDEEVIGSKGVIKRVEFVRLIAKALYSL
WD-40 repeat domains are GYEKSGARLEEESGIPLQSSVVNLFMQQISDGLWDESVV
underlined. TLHKIGLSDENLVKSASFLILEQKFLELLDQEKAMDALK
TLRTEITPLCIKNSRVRELSSCIISPSSCGLLNQNKRNS
TRARSRSELLEELQKLLPPAVIIPERRLEHLVEQALVLQ
TDACMLHNSIDMEMSLYTDHQCGKEHIPCRTLQILQSHN
DEVWLVQFSHNGKYLASASNDRSAIIWEVDENGSVSLKH
KLTGHQKPISSVCWSPDDRQLLTCGVGETVRRWDVSSGE
CLRVYEKAGHGLISCAWFPDGKWICYGVSDRSICMCDLE
GKEIECWKGQRTLSISDLEITSDGKQIISICRETAILLL
DREAKYERMIEENQTITSFSLSKDNRYLLVNLLNQEIHL
WDIKGDFRLVAKYKGLKRSRFVIRSCFGGLKQAFVASGS
EDSQVYIWHKGSGELIEPLPGHSGAVNCVSWNPANHHML
ASASDDRTIRIWGLNELNTRHKGARPNGVHYCNGNGTS
115 The amino acid sequence of SEQ ID MTQLAETYACMPSTERGRGILIAGNPKPGSNSVLYTNGR
381. The conserved G-protein beta SVVILNLDNPLDISVYAEHAYPATVARFSPNGEWVASAD
WD-40 repeat domains are SSGAVRIWGAYNDHVLKKEFKVLSGRIDDLQWSPDGLRI
underlined. VASGDGKGKSLVRAFMWDSGTNVGEFDGHSRRVLSCAFK
PTRPFRIVTCGEDFLVNFYEGPPFKFKLSRRDHSNFVNC
LRFSPDGNRFISVSSDKKGIIYDGKTGEKIGELSSDGGH
TGSIYAVSWSPDSKQVITVSADKSAKIWDISEDGSGNLR
KTLTSSGSGGVDDMLVGCLWQNNHLVTVSLGGTISIYTA
GDLDKAPVSFSGHMKNVSSLSVLKGDPKVILSSSYDGLI
IKWIQGIGFSGRVQRKESTQIKCLAAVDEEIVTSGYDNK
VCRVSGSGDAEFIDIGCQPKDLSLALQCPEFALVSTDTG
VVLLRGAKIVSTINLGFAVTASTVAPDGTEAIIGAQDGK
LRIYSISGDTLTEEAVLEKHRGAISVIHYSPDLSMFASG
DLNREAVVWDRASREVRLKNILYHTARINCLAWSPDSST
VATGSLDTCVIIYEVDKPASNRLTIKGAHLGGVYGLAFT
DDFSVVSSGEDACIRVWKINRQ
116 The amino acid sequence of SEQ ID MKVKVISRSTDEFTRERSQDLQRVFRNFDPNLRTQEKAV
382. The conserved G-protein beta EYVRALNAAKLDKVFARPFVGAMDGHVDSVSCMAKNPNY
WD-40 repeat domains are LKGIFSGSMDGDIRLWDIASRRTVCQFPGHQGPVRGLAA
underlined and the SOF1 protein STDGQILVSCGIDSTVRLWNVPVATLGESDGTHENLAKP
domain is in bold. LAVYVWKNAFWAVDHQWDGELFATAGAQVDIWNQNRSQP
ISSFEWGTDTVISVRFNPGEPNVLATSGSDRSITLYDLR
MSSPTRKVIMRTKTNAISWNPMEPMNFTAANEDCNCYSY
DARKLEEAKCVHKDHVSAVMDIDYSPTGREFVTGSYDRT
VRIFQYNGGHSREVYHTKRMQRVFCVKFSCDASYVISGS
DDTNLRLWKAKASEQLGVVLPRERRKHEYHEAVKSRYKH
LPEVKRIVRHRHLPKPIYKAGILRRTVNEADRRKEERRK
AHSAPGSSSAEPLRKRRIIKEIE
117 The amino acid sequence of SEQ ID MVRSIKNPKKAKRKNKGSKNGDGSSSSSSIPSMPTKVWQ
383. The conserved G-protein beta PGVDKLEEGEELQCDPSAYNSLHAFHIGWPCLSFDIVRD
WD-40 repeat domains are TLGLVRTEFPHQVYFVAGTQAEKPTWNSIGIFKVSNITG
underlined. KRRELVPSKPTDDADEESDSSDSDEDSDDEVGGSGTPIL
QLRKVGHEGCVNRIRAMNQNPHICASWGDSGHVQIWDFS
SHLNALAESEADVSQGASSVFNQAPLVKFGGHKDEGYAL
DWSPLVPGRLVSGDCKNSIHLWEPTSGSTWNVDSTPFIG
HAASVEDLQWSPTEENVFASCSVDGTIAIWDTRLGKTPA
ASFKAHDADVNVISWNRLATCMLASGCDDGTFSIHDLRL
LKEGDSVVAHFEYHKHPVTSIEWSPHEASTLAVSSADCQ
LTIWDLSLEKDEEEEAEFKAKTKEQVNAPEDLPPQLLFV
HQGQKDLKELHWHAQIPGMIVSTAADGFNILMPSNIQST
LPSDGA
118 The amino acid sequence of SEQ ID MERYKVIKELGDGTYGSVWKALNQQTHEIVAIKKMKRKY
384. The conserved eukaryotic YIWEECINLREVKSLRKLNHPNIIKLKEVIRENNELFFI
protein kinase domain is FEYMECNLYQIMKERSTPFSETAIIKFCYQILQGLSYMH
underlined and the protein kinases RNGYFHRDLKPENLLVTSDLIKIADFGLAREVLTSPPYT
ATP-binding region and DYVSTRWYRAPEVLLQSPTYTTAIDMWAVGAILAELFTL
serine/threonine protein kinases HPLFPGESELDEIYKICGVLGTPDYETWPDGMQLAAFRN
active-site signatures are in FIFPQFLPVNLSVLIPHASPEAIDLITRLCSWDPQKRPT
bold. AEQALHHPFFRIGMSIPLSLGGHFQDNTCAAEVDTKFHS
KKACKAWNGEKESSLECFLGLSLGLKPSLGHLGAMGSQG
VGAVKQEVGSSPGCQSNPKQSLFQVLNSRAILPLFSSSP
NLNVVPVKSSLPSAYTVNSQVMWPTIAGPPAAAVTVSTL
QPSILGDFKIFGKSMGLASQYAGKEASPFS
119 The amino acid sequence of SEQ ID MGEMGRGINNSSNNNNSNRPAWLQHYDLVGKIGEGTYGL
385. The conserved eukaryotic VFLARSKLPNNRGLRIAIKKFKQSKDGDGVSPTAIREIM
protein kinase domain is LLREFSHENVVKLVNVHINHVDMSLYLAFDYAEHDLYEI
underlined and the protein kinases IRHHREKLNHHNINQYTVKSLLWQLLNGLNYLHSNWIVH
ATP-binding region and RDLKPSNILVMGEGEEHGVVKIADFGLARIYQAPLKPLS
serine/threonine protein kinases DNGVVVTIWYRAPELLLGAKHYTSAVDMWAVGCIFAELI
active-site signatures are boxed TLKPLFQGVEVKASPNPFQLDQLDKIFKVLGHPTIEKWP
in bold. TLMNLPHWSKNLQQIQQHKYDNAGLHIGPIPAKSPAYDL
LSKMLEYDPRKRITAAQALEHEYFRIDPQPGRNALVPSQ
PGEKAINYPPRLVDANTDFDGTIAPQPSQVSSGNAPSGS
IASAAVPAVRPLPQQMQLMGMQRMQNPGMAAFNLGAQAS
MSGLNHNNIALQRGSSQQQAHQQVRRKEPNSGFPNTGYP
PPPKSRRL
120 The amino acid sequence of SEQ ID MDKYEKLEKVGEGTYGKVYKARDKMTGQLVALKKTRLEM
386. The conserved protein kinase DEEGVPPSSLREISLLQMLSQSIYVVRLLCVEHVTKKGK
family domain is underlined. The PLLYLVFEYLDTDLKKFIDYRRSVNAGPLPQNVIQSFMY
protein kinases ATP-binding region QLLKGVAHCHSHGVLHRDLKPQNLLVDKSKGLLKVGDLG
is in bold and the LGRAFTVPLKCYTHEVVTLWYRAPEVLLGSTHYSTPVDI
serine/threonine protein kinases WSVGCIFAEMVRRQPLFPGDCEIQQLLHIFTLLGTPTEE
active-site signature is in MWPGVKRLRDWHEYPQWKPENLARAVPNLSPTGLDLISK
bold/italics. MLQCDPAKRISAKAAMNHPYFDDLDKSQF
121 The amino acid sequence of SEQ ID MDGYEKMDKVGEGTYGKVYMARDKKTGQLVALKKTRLEN
387. The conserved protein kinase DGEGIPPTALREISLLQMLSQDIYIVRLLDVKHTENKLG
family domain is underlined. The KPLLYLVFEYMESDLKKYIDSYRRSHTKMPPSMIKSFMY
protein kinases ATP-binding region QLCRGVAYCHSRG DKEKGVLKIADLG
is in bold and the LSRAFTVPVKKYTHEIVTLWYRAPEVLLGATHYSLPVDI
serine/threonine protein kinases WSVGCIFAEMSRMQALFTGDSEVQQLMNIFRFLGTPNEE
active-site signature is in VWPGVTKLKDWHIYPEWKPQDISHAVPDLEPSGLDLLSQ
bold/italics. MLVYEPSKRISAKKALEHPYFDDLDKSQF
122 The amino acid sequence of SEQ ID MDAYEKLEKVGEGTYGKVYKAKDKNTGQLVALKKTRLES
388. The conserved eukaryotic DDEGIPPTALREISLLQMLSQDIHIVRLLDVEHTENKNG
protein kinase domain is KPLLYLVFEYMDSDLKKYIDGYRRSHTKVPPNIIKSFMY
underlined and the protein kinases QLCQGVAYCHSRGVMHRDLKPHNLLVDKQRGVVKIADLG
ATP-binding region and LGRAFTIPIKKYTHEIVTLWYRAPEVLLGATHYSTPVDI
serine/threonine protein kinases WSVGCIFAEMVRLQALFIGDSEVQQLFKIFSFLGTPNEE
active-site signatures are in IWPGVTKFRDWHIYPQWKPQDISSAVPDLEPSGVDLLSK
bold. MLVYEPSKRISAKKALEHPYFDDLDKSQF
123 The amino acid sequence of SEQ ID MDSYEKLEKVGEGTYGKVYKAKDKKTGKLVALKKTRLEN
389. The conserved protein kinase DGEGIPPTALREISLLQMLSQDMNIVRLLDVEHTENKNG
family domain is underlined. The KPLLYLVFEYMDSDLKKYVDGYRRSHTKMPPKIIKSFMY
protein kinases ATP-binding region QLCQGVAYCHSRG DKQRGVLKIADLG
is in bold and the LGRAFTVPIKKYTHEIVTLWYRAPEVLLGATHYSTPVDI
serine/threonine protein kinases WSVGCIFAEMSRMHALFCGDSEVQQLMSIFKFLGTPNEG
active-site signature is in VWPGVTKLKDWHIYPEWRPQDLSRAVPDLEPSGVDLLTK
bold/italics. MLVYEPSKRISAKKALQHPYFDDLDKSQF
124 The amino acid sequence of SEQ ID MEKYEKLEKVGEGTYGKVYKGRDKRTGRLVALKKTPFHQ
390. The conserved eukaryotic EEGIPPTAIREISLLKSLSQCIYIVKLLDVKASFNGKGK
protein kinase domain is HVLFMVFEYADSDLKKHIDAHRQCNTKLSPRSIQSYMFQ
underlined and the protein kinases LCKGIAYCHSHGVLHRDLKPQNILVDQKIGLLKIADLGL
ATP-binding region and GRACTVPIKSYTFEVVTLWYRAPEVLLGAKRYSMALDIW
serine/threonine protein kinases SLGCIFAELCNLQALFAGDSQIQQLINIFRLLGTPNEQL
active-site signatures are in WPGVTQLSDWHEFPQWRPQDLSKVVFNLDPNGVDLLSKM
bold. LQYDPAKRISAKEALDHPYFDSLDKSQF
125 The amino acid sequence of SEQ ID MGCVCGKPSARAADYVESPAEKGASSNSRSSSMASRRLV
391. The conserved eukaryotic APAVMDQGIDAENGHEGDYRTKLRGKQSNGADPVSLLSD
protein kinase domain is DAEKQRHSRHHQHQQHHPIRPHHLRPQGEFVPNANSNPR
underlined and the FGNPPRHIEGEQVAAGWPAWLTAVAGEAIKGWIPRRADS
serine/threonine protein kinases FEKLDKIGQGTYSNVYKARDLDTGKIVALKKVRFDNLEP
active-site signatures are in ESVRFMAREIQVLRRLDHPNVVKLEGLVTSRMSCSLYLV
bold. FEYMDHDLAGLAACPGIKFTEPQVKCYMQQLLRGLDHCH
SRGVLHRDIKGSNLLIDNGGILKIADFGLATFFHPDQRQ
PLTSRVVTLWYRPPELLLGATEYGVAVDLWSTGCILAEL
LAGKPIMPGRTEVEQLHKIFKLCGSPSEDYWKKSKLPHA
TIFKPQQPYKRCVAETFKDFPPSALALMEVLLAIEPADR
GTATSALKSDFFTTKPLACDPSSLPKYPPSKEFDAKIRD
EEARRQRAAGGRGRDAARRPSRESRAIPAPEANAELAIS
IQKRRLSSQGPSKSKSEKFNPQQEDGAVGFPIEPPRPMH
IGIDAGATSRMYSQQFGPSHSGPLSNQISSSIWGKNQKE
DEIQMAPGRPSRSSKATISDFRKPGACAPQPGADLSHLS
SLVATARSNAGIDTHKDRSGMWQHNRIDAIDGVHNNGKH
EFLEVPEHPNRQDWTRFQQPESFKGLDNYHLQDLPATHH
RKDERVASKEATMNWQGYGGQGGDKIHYSGPLLPPSGNI
DEILKEHERHIQHAVRRARQDKGRPQRSNLSQNERKAFE
HRSFVSGVNGNAGYSDLVNELPISVGSNRLKVSKTRGTE
EIVELRELEREPLSSVMEKYEREHEM
126 The amino acid sequence of SEQ ID MGCVCAKQSDILGEPESPKVKGSNLASSRWSVSSETKQL
392. The conserved eukaryotic PQHSDSGILHHQHYYHPRDESDEAKLKESNYGGSKRRTR
protein kinase domain is QGRDPADLDMGIFVRTPSSQSEAELVAAGWPAWMAAFAG
underlined and the serine/threonine EAIHGWIPRRAESFEKLYKIGQGTYSNVYKARDLDNGKI
protein kinases active-site VALKKVRFDSLDAESVRFMAREILVLRKLDHPNIVKLEG
signatures are in bold. LVTSEVSSSLYLVFEYMEHDLAGLAACPGIKFTEPQVKC
YMQQLLQGLDHCHRHGVLHRDIKGSNLLIDNGGILKIAD
FGLATFFYPDQKQLLTSRVVTLWYRPPELLLGATDYGVA
VDIWSAGCILAELLAGKPILPGRTEVEQLHKIFKLCGSP
SEDYWKESKLPHATIFKPQHPYKSCIAEAFKDFSPSALA
LLETLLAIEPGHRGEASGALKSEFFTTEPLSCDPSSLPK
YPPSKEFDAKLRAQETRRQRDVGVRGHGSEAARRTSRLS
RAGPTPNEGAELTALTQKQHSTSHATSNIGSEKPSTKKE
DYTAGLHIDPPRPVNHSYETTGVSRAYDAIRGVAYSGPL
SQTHVSGSTSGKKPKRDHVKGLSGQSSLQPSKPFIVSDS
RSERIYEKSHVTDLSNHSRLAVGRNRDTTDPHKSLSTLM
QQIQDGTLDGIDIGTHEYARAPVSSTKQKSAQLQRPSAL
KYVDNVQLQNTRVGSRQSDERPANKESDMVSHRQGQRIH
CSGPLLHPSANIEDLLQKHEQQIQQAVRRAHHGKREALS
NKSSLPGKKPVDHRAWVSSGKGNKESPYFKGKGNKELSD
LKGGPTAKVTNFRQKVM
127 The amino acid sequence of SEQ ID MAVANPGQLNLQEAPSWGSRSVNCFEKLEQIGEGTYGQV
393. The conserved protein kinase YMAKEIETGEIVALKKIRMDNEREGFPITAIREIKLLKK
family domain is underlined. The LQHENVIKLKEIVTSPGPEKDEQGKSDGNKYNGSIYMVF
protein kinases ATP-binding region EYMDHDLTGLAERPGMRFSVPQIKCYMKQLLIGLHYCHI
is in bold and the NQ DNNGILKLADFGLARSFCSDQNGN
serine/threonine protein kinases LTNRVITLWYRPPELLLGSTKYGPAVDMWSVGCIFAELL
active-site signature is in YGKPILPGKNEPEQLTKIFELCGSPDESNWPGVSKLPWY
bold/italics. SNFKPQRQMKRRVRESFKNFDRHALDLVEKMLTLDPSQR
ISAKDALDAEYFWTDPVPCAPSSLPRYEPSHDFQTKRKR
QQQRQHDEMTKRQKISQHPPQQHVRLPPIQNAGQGHLPL
RPGPNPTMHNPPPQFPVGPSHYTGGPRGAGGQNRHPQNI
RPLHAAQGGGYNANRGYGGPPQQQGGGYPPHGMGNQGPR
GGQFGGRGAGYSQGGPYGGPVGGRGPNVGGGNRGPQFWS
EQ
128 The amino acid sequence of SEQ ID MQNMEDNVQSSWSLHGNKEICARYEILERVGSGTYSDVY
394. The conserved eukaryotic RGRRKADGLIVALKEVHDYQSSWREIEALQRLCGCPNVV
protein kinase domain is RLYEWFWRENEDAVLVLEFLPSDLYSVIKSGKNKGENGI
underlined and the PEAEVKAWMIQILQGLADCHANWVIHRDLKPSNLLISAD
serine/threonine protein kinases GILKLADFGQARILEEPEAIYEVEYELPQEDIVADAPGE
active-site signature is in bold. RLMEEDDSVKGVRNEGEEDSSTAVETNFGDMAETANLDL
SWKNEGDMVMQGFTSGVGTRWYRAPELLYGATIYGKEID
LWSLGCILGELLILEPLFSGTSDIDQLSRLVKVLGTPTE
ENWPGCSNLPDYRKLCFPGDGSPVGLKNHVPSCSDSVFS
ILERLVCYDPAARLNAKEVLENKYFVEDPYPVLTHELRV
PSPLREENNFSEDWAKWKDMEADSDLENIDEFNVVHSSD
GFCIKFS
129 The amino acid sequence of SEQ ID MDLNQYPEDLNPELPEGTDNVDNPDNNKGSPVPSPHPPL
395. The conserved eukaryotic KPLDPSERYRKGITLGQGTYGIVYKAFDTVTNKTVAVKK
protein kinase domain is IHLGKAKEGVNVTALREIKLLKELSHPNIIQLIDAYPHK
underlined and the protein kinases QNLHIVFEFMETDLEAVIKDRNLVFSPADIKSYLQMTLK
ATP-binding region and GLAVCHKKWVLHRDMKPNNLLIAADGQLKLGDFGLARLF
serine/threonine protein kinases GSPDRKFTHQVFAVWYRAPELLFGAKQYGPAVDIWATGC
active-site signatures are in IFAELLLRKPFLQGVSDLDQIGKIFAAFGTPRQSQWPDV
bold. ASLPDFVEFQFVPAPSLRSLFPMASEDALDLLSKMFTLD
PKNRITAQQALEHRYFSSVPAPTRPDLLPKPSKVDSSRP
PKHASPDGPVVLSPSKARRVMLFPNNLAGILPKQVSQST
TGGTPIEFDMPTQKLREVCPRSRITESGKKHLKRKTMDM
SAALDECAREQEGQEGKTILDPDHQRSAKKEKHM
130 The amino acid sequence of SEQ ID MAGGQENCVRITRARAACVSKASAPVIQSQVDEKKSRKR
396. The conserved cyclin N- and APKRAAVDDLAANASGSQPKRRAVLGDVTNLHAAATDCL
C-terminal family domains are STAEDQVDAPNPSIKGRARNKKKEARTSTKVVKDEIHPE
underlined. SNPLADHSSNLSECQKPPAAKLAEQRSLRGVPSKAKQGG
SSNSQSCSKHTDIDKDHTDPQMCTTYVEDIYEYLRNAEL
KNRPSANFMETAQNDITPNMRAILVDWLVEVSEEYKLVP
DTLYLTVSYIDRYLSANPTSRHKLQLLGVSCMLIASKYE
EVCPPHVEEFCYITDNTYTRDEMLSMERKILIFLNFEMT
KPTTKSFLRRFVRASQAGNKAPSLHMEFLANYLAELTLM
ECSFLQYLPSLIAASTVFLSRLTLDFLTNPWNPTLAHYT
GYKASQLKDCVMAIYNVQMNRKGSTLVAIREKYQQHKFK
CVASLPPPPFIAERFFDTPN
131 The amino acid sequence of SEQ ID MTGTQASNVRITRARAAKSTLNNALPPLPPAQGKPRGKR
397. The conserved cyclin and AATESNISGFSVAAEPLKRRAVLSDVSNICKEAAAVDCL
cyclin C-terminal domains are KKPKAVKVVSQNANAKGRGRGIPRNNKKITQEAEIKKET
underlined and the cyclins SPAICNVDDASAGNAIGDDKQNNNVNPLKEVQDNPKELN
signature is in bold. PIAEQISVHPHCKQSVEKPNEKEIVVSDNKAAIASLKQQ
STLQSLRIPKQPKYSLKQGNPVPLANLHEDVGRSSCSDF
IDIDSEYKDPQMCTAYVTDIYANMRVVELKRRPLPNFME
TTQRDINANMRSVLIDWLVEVSEEYKLVPDTLYLTVSYI
DRFLSANVVNRQRLQLLGVSCMLVASKYEEICAPPVEEF
CYITDNTYKKEEVLEMEISVLNRLQYDLTTPTTKTFLRR
FIRAAQASCKVSSLHLEFMGNYLAELTLVEYDFLKYLPS
LIAAAAVFVARMTLDPMVHPWNSTLQHYTGYKVSDMRDC
ICAIHDLQLNRKGCTLAAIREKYNQPKFKCVANLFPPPI
ISPQFLIDNEV
132 The amino acid sequence of SEQ ID MAAPNQNALLINNNNRRPLVDIGNLVGALNAQCNISKNG
398. The conserved cyclin and ARKRAFGDIGNLVEDLDAKCTISKYWVRKRPRTNFGVNA
cyclin C-terminal domains are NKGASSSTQGQGIVVRGEQKAWDRIVWGNKQSCAIKMNA
underlined and the cyclins QHVTATQRGTAISISDIIDSSVQDGGIKAPSQLKARKQT
signature is in bold. VRTVTATLTARSEDSLRDVLEVPPGIDDGDRDNPLAVVE
YVEDIYHFYRKIEVRSCVPPDYMTRQLEIKDSMRGVIID
WLIEVHRTFLLMPETLYLTVNIIDRYLSIQSVTRNELQL
MGITAMFIASKYEEISPPKINDLVYITKDAYTSKQIVNM
EHTILNRLKFKLTVPTPYVFLVRFLKAAGPDKVMKNLAF
FLVDLCLLHYKMIKYSPSMLAAAAVYTAQCTLKKHPYWN
KTLILHIGYSEAHLRECAHLMADLHLKAEGSNLKSVYKK
YSYPIFGSVAFLSPAKIPAGTVAAPAIDKCAHQIYLRNLR
133 The amino acid sequence of SEQ ID MFPNKQTQGLVQNKKMASKAAQPKAMVPPQRVPPAANNR
399. The conserved cyclin N- and RALGDIGNIVADVGGKCNVTKDGVNGKPLAQVSRPITRS
C-terminal family domains are FGAQLLAQAAANKGISAANNQTQVPVVIPKADVRGNKQR
underlined. RTSKSKDIPPTTVVTNESDDCVIIEQAQRIKPTCNHNVG
AVGNKEKPQLLTAKPKSLTASLTSRSAVALRGFRFDDEM
TEAEEDPLPNIDVGDRDNQLAVVEYVEDIYKFYRRTEQM
SCVPDYMPRQQEINPKMRAVLINWLIEVHYRFGLMPETL
YLTTNLIDRYLATQLVSRSNYQLVGATAMLLASKYEEIW
APEMNDFLDILENKFERKHVLVMEKAMLNKLKFHLTVPT
PYVFLVRFLKAAASDEEMENLVFFLMELSLMQYVMIKFP
PSMLAAAAVYTAQITLKKTTVWNDVLKRHTGYSEIDLKE
CTRLMVAFHQSSEESKLNVVFKKYSMPEYDSVALIKPAK
LPA
134 The amino acid sequence of SEQ ID MAPSFDCVANAYIESCEDQEKLRQNAQILAQSGENDVDE
400. The conserved cyclin and PVSMLVQRETHYMLPEDYLQRLRNRTLDVNVRREAVGWI
cyclin C-terminal domains are LKVHSFYNFSAPTAYLAVNYLDRFLSRHRMPQGVKAWMI
underlined. QLMAVACLSLAAKMEETQVPLPSDLQREDARFIFDARTI
QRMELLILSTLQWGMRSITPFSFIDYFAYRAVQGHGHGH
DATPKAVMSRAIELILSTTEEIDFMEYRPSAIAAAALLC
AAEEVVPLQAVHYKRALSSSITDVDKDKMFGCYNLIQET
IIEGGCYWTPMSLQSTEKTPVGVLDAAACLSNTPTSSYS
VKPYASVTAAKRRKLNEICSALLVSQAHPC
135 The amino acid sequence of SEQ ID MAANFWTSSHCKELLDAEKVGIVHPLDKDQGLTQEDVKI
401. The conserved cyclin and IKINMSNCIRTLAQYVKLRQRVVATAITYCRRVYTRKSF
cyclin C-terminal domains are TEYDPQLVAPTCLYLASKAEESTVQAKLVIFYMKKYSKH
underlined. RYEIKDMLEMEMKLLEALDYYLVIYHPYRPLIQFLQDAG
LNDLKVTAWALVNDTYRTDLILTYPPYMIALACIYFACI
MEEKDAQAWFEELRVDMNEIKNISMEIVDYYDNYRVIPD
EKMNSALNKLPHRF
136 The amino acid sequence of SEQ ID MAPALSSSYECLSHLLCAEDASNVVGCWDEDESKIFCEE
402. The conserved cyclin domain EEGFGIQHFPDFPVPDDDEIRVLVRKESQYMPGKSYVQS
is underlined. YQNLGLDFTARQNAIGWILKVHGSYNFGPLTAYLSINYL
DRFLSRNPLPKAKVWMLQLLSVACLSLAAKMEETQVPLL
LDLQAEEPDFLFEPRTIQRMELLVLSTLEWRMLSVTPFS
FVDYFLQGGGGRKPPPRAMVARANELIFNTHTVLDFLEH
RPSAIAAAAVICAAEEVLPLEAAQYKETILSCSLVDKEW
VFGSYNLIQEVLIEKFSTPKKAKSASSSIPQSPVGVLDA
FCLSNNSNNTSLEASLSVNLYASVAAKRRKLNDYCNTWR
MFQHSTC
137 The amino acid sequence of SEQ ID MAPNCIDCAPSDLFCAEDAFGVVEWGDAETGSLYGDEDQ
403. The conserved cyclin domain LHYNLDICDQHDEHLWDDGELVAFAEKETLYVPNPVEKN
is underlined. SAEAKARQDAVDWILKVHAHYGFGPVTAVLSINYLDRFL
SANQLQQDKPWMTQLAAVACLSLAAKMDETEVPLLLDFQ
VEEAKYIFESRTIQRMELLVLSTLEWRMSPVTPLSYIDH
ASRMIGLENHHCWIFTMRCKEILLNTLRDAKFLGLLPSV
VAAAIMLHVIKETELVNPCEYENRLLSAMKVNKDMCERC
IGLLIAPESSSLGSFSLGLKRKSSTINIPVPGSPDGVLD
ATFSCSSSSCGSGQSTPGSYDSNNSSILCISPAVIKKRK
LNYEFCSDLHCLED
138 The amino acid sequence of SEQ ID MPQIQYSEKYTDDTYEYRHVVLPPETAKLLPKNRLLNEN
404. The conserved cyclin- EWRAIGVQQSRGWVHYAIHRPEPHIMLFRRPLNYQQNQQ
dependent kinases regulatory QQAGAQSQPMGLKAQ
subunit domain is underlined and
the cyclin-dependent kinases
regulatory subunits signature 1 is
in bold.
139 The amino acid sequence of SEQ ID MDQIEYSEKYYDDTYEYRHVELPPDVARLLPKNRLLTEN
405. The conserved cyclin- EWRGIGVQQSRGWVHYAIHCSEPHIMLFRRPLNYEQNHQ
dependent kinases regulatory HPEPHIMLFRRPLNCQPNHQPQAHHPT
subunit domain is underlined and
the cyclin-dependent kinases
regulatory subunits signature 1 is
in bold.
140 The amino acid sequence of SEQ ID MDQIEYSEKYYDDTYEYRHVELPPDVARLLPKNRLLTEN
406. The conserved cyclin- EWRGIGVQQSRGWVHYAIHCSEPHIMLFRRPLNYEQNHQ
dependent kinases regulatory HPEPHIMLFRRPLNCQPNHQPQAHHPT
subunit domain is underlined and
the cyclin-dependent kinases
regulatory subunits signature 1 is
in bold.
141 The amino acid sequence of SEQ ID MPQIQYSEKYYDDTYEYRHVVLPPDVARLLPKNRLLNEN
407. The conserved cyclin- EWRGIGVQQSRGWVHYAIHRPEPHIMLFRRHLNYQQNQQ
dependent kinases regulatory QQAQQQPAQAMGLQA
subunit domain is underlined and
the cyclin-dependent kinases
regulatory subunits signature 1 is
in bold.
142 The amino acid sequence of SEQ ID MALVETEPVTLIHPEEPKKFKKKPTPGRGGVISHGLTEE
408. The conserved GCN5-related N- EARVKAIAEIVGAMVEGCRKGEDVDLNALKAAACRRYGL
acetyltransferase family domain is SRAPKLVEMIAALPDGERAAVLPKLKAKPVRTASGIAVV
underlined and the radical SAM AVMSKPHRCPHIATTGNICVYCPGGPDSDFEYSTQSYTG
family domain is in bold. YEPTSMRAIRARYNPYVQTRSRIDQLKRLGHTVDKVEFI
LMGGTFMSLPADYRDYFIRNLHDALSGHTSSNVEEAVCY
SEHSATKCIGLTIETRPDYCLGPHLRQMLSYGCTRLEIG
VQSTYEDVARDTNRGHTVAAVADCFCLAKDAGFKVVAHM
MPDLPNVGVERDMESFREFFENPAFRADGLKIYPTLVIR
GTGLYELWKTGRYRNYPPEQLVDIIARVLALVPPWTRVY
RVQRDIPMPLVTSGVEKGNLRELALARMDDLGLKCRDVR
TREAGIQDIHHKIRPEVVELVRRDYCANEGWETFLSYED
TRQDILVGLLRLRKCGHNTTCPELKGRCSIVRELHVYGT
AVPVHGRDADKLQHQGYGTLLMEQAERIAWKEHRSIKIA
VISGVGTRHYYRKLGYELEGPYMMKYLN
143 The amino acid sequence of SEQ ID MLGFRDLYTSICEHLQRASGRLPIIAAATSLISTPEIAA
409. The conserved chromo domain VEKENKAPNSVDKMGMGSADESGRFSTSNGQFMNMNNGV
is underlined and the MOZ/SAS-like VKEEWKGGVPVVPSAPTTVPVITNVKLETPSSPDHDMAR
protein domain is in bold. KRKLGFLPLEVGTRVLCKWRDGKFHPVKIIERRKLPNGA
TNDYEYYVHYTEFNRRLDEWVKLEQLELDSVETDADEKV
DDKAGSLKMTRHQKRKIDETHVEGNEELDAASLREHEEF
TKVKNITKIELGRYEIETWYFSPFPSEYNNCEKLYFCEF
CLNFMKRKEQLQRHMRKCDLKHPPGDEIYRSGTLSMFEV
DGKKNKVYAQNLCYLAKLFLDHKTLYYDVDLFLFYILCE
CDERGCHMVGYFSKEKHSEESYNLACILTLPPYQRKGYG
KFLISFSYELSKKEGKVGTPERPLSDLGLLSYRGYWTRV
LLDILKKHKSNISIKELSDMTAIKADDVLSTLQGLDLIQ
YRKGQHAICADPKVLDRHLKAVGRGGLEVDVCKLIWTPY
KEQ
144 The amino acid sequence of SEQ ID MGSLDESTCSEEIRDEGKDSIRTKFKVESTVNNAQNGGN
410. The conserved MOZ/SAS-like DNSKKKRAAGLPLEVGIRLLCKWRDSKLHPVKIIERRKL
protein domain is underlined. PNGFPQDYEYYVHYTEFNRRLDEWVKLEQFELDSVETDA
DEKIEDKGGSLKMTRHQKRKIDEIHVEEGQGHEDFDPAS
LREHEEFTKVKNIAKVELGRYEIETWYFSPFPPEYSHCE
KLFFCEFCLNFMKRKEQLQRHMRKCDLKHPPGDEIYRNG
TLSMFEVDGKKNKIYGQNLCYLAKLFLDHKTLYYDVDLF
LFYVLCECDDRGCHVVGYFSKEKHSDEAYNLACILTLPP
YQRKGYGKFLIAFSYELSKKEGKVGTPERPLSDLGLLSY
RGYWTRILLDILKKQRGNISIKELSDMTAIKVEDVISTL
QVLDLIQYRKGQHVICADPKVLDRHLKAAGIAGLEVDVS
KLIWTPYKEQCG
145 The amino acid sequence of SEQ ID MASAPMVGCDDSRDKHRWVESKVYMRKGHGKGSKGNAGF
411. The conserved bromo family NAQNSTAQVRRENDNMGNSIADNGKSEAASEGLSSLSRK
domain is underlined. QITVNQDHPPNETSSMPAVGGLQNIDTHVTFKLEGCSKQ
EIWELRKKLTNELEQVRGTFKKLEARELQLRGYSVSAGV
NTSYSASQFSGNDMRNNGGKEVTSEVASGGAITPKQAQR
ESNPPRQLSISLMENNQAASDMGEKGKRTPKANQYYRNS
EFVLGKDKFPPAESKKSKSTGNKKISQSKVFSKETMQVG
KEFMPQKSVNEVFKQCSLLLTKLMKHKYGWVFNLPVDAQ
ALGLHDYHTIIKRPMDLGTVKSKLEKNLYNSPASFAEDV
KLTFSNAMTYNPKGHEVHTMAEQLLQLFEERWKTIYEEH
LDGKMRFGSGQGLGASSSTKKLPFQDSKKNIKKSEPAGG
PSPPKPKSTNHHASRTPSAKKPKAKDPHKRDMTYEEKQK
LSTNLQNLPQERLELIVQIIKKRNPSLCQHDEEIEVDID
SFDTETLWELDRFVTNYKKSLSKNKKKALLADQAKRASE
HGSARNKHPMIGRELPMNNKKGEQGEKVVEIDHMPPVNP
PVVEVEKDGVYAKRSSSSSSSSSDSGSSSSDSDSGSSSG
SESDAYAATSPPAGSNTSARG
146 The amino acid sequence of SEQ ID MEGHSGALGFGQGFSRSSQSPNLSPSPSHSASASVTSSG
412. The conserved GCN5-related N- QKRKRNEVEHAGVASNSTGMFAVPPSHIYSHLHPMSMSM
acetyltransferase family domain is PMPMHNSHPSSLSESRDGALTSNDDDDNLTGGNQSQLDS
underlined and the bromodomain is MSAGNTDGREDFDDEDDDDDDEEDDDEVEGDEEDQDHDP
in bold. DADDDSDDGHDSMRTFTAARLDNGAPNSRNLKPKADAAG
VAIAPTVKTEPILDTVKEEKVSGNNNNNSVSANNAQVAP
SGSAVLLSAVKEEANKPTSTDHIQTSGAYCAREESLKRE
EDADRLKFVCFGNDGIDQHMIWLIGLKNIFARQLPNMPK
EYIVRLVMDRSHKSVMIIKQNQVVGGITYRPYLSQKFGE
IAFCAITADEQVKGYGTRLMNHLKQHARDVDGLTHFLTY
ADNNAVGYFIKQDFTKEIKLEKERWHGYIKDYDGGILME
CKIDPKLPYTDLPAMIRWQRQTIDEKIRELSNCHIVYSG
IDIQKKEAGIPRKPIKVEDIPGLKEAGWTTDQWGHSRFR
LLNSPSEGLPNRQVLHAFMRSLHKAMVEHADAWPFKEPV
DPRDVPDYYDIIKDPMDVKRMFTNARTYNTHETIYYKCA
NR
147 The amino acid sequence of SEQ ID MEESGNSLTSGPDGSKRRVSYFYDSDIGNYYYSQGHPMK
413. The conserved histone PHRIRMAHSLIVHYALDEKMEVCRPNLLQSRELRVFHAD
deacetylase family domain is DYISFLQSVTPETQHEQLRQLKRFNVGEDCPVFDGLYNF
underlined. CQTYAGGSVGAAIKLNNKEADIAINWSGGLHHAKKCEAS
GFCYVNDIVLAILELLKVHQRVLYIDIDIHHGDGVEEAF
YSTDRVMSVSFHKFGDYFPGTGHLKDVGYGKGKYYSLNV
PLNDGIDDESYKNLFRPIIQKVMEIYQPEAVVLQCGADS
LSGDRLGCFNLSVKGHADCVRFLRSFNVPLVLVGGGGYT
IRNVARCWCYETAVAVGVEPQDKLPYNEYYEYFGPDYTL
HVAPSNMENQNSAKELAKIRNTLLEQLKRIQHVPSVPFQ
ERPPDTKFPEEDEEDYEKRPKGHKWGGEYFGSESDEEQK
PQNRDIDISDKPGIRRQSPPNVEAAKKIKVEEEDGDIGI
VNENDGAKWPLGEAG
148 The amino acid sequence of SEQ ID MEESGNSLTSGPDGSKRRVSYFYDSDIGNYYYSQGHPMK
414. The conserved histone PHRIRMAHSLIVHYALDEKMEVCRPNLLQSRELRVFHAD
deacetylase domain is underlined. DYISFLQSVTPETQHEQLRQLKRFNVGEDCPVFDGLYNF
CQTYAGGSVGAAIKLNNKEADIAINWSGGLHHAKKCEAS
GFCYVNDIVLAILELLKVHQRVYIDIDIHHGDGVEEAF
YSTDRVMSVSFHKFGDYFPGTGHLKDVGYGKGKYYSLNV
PLNDGIDDESYKNLFRPIIQKVMEIYQPEAVVLQCGADS
LSGDRLGCFNLSVKGHADCVRFLRSFNVPLVLVGGGGYT
IRNVARCWCYETAVGVEVEPQDKLPYNEYYEYFGPDYTL
HVAPSNMENQNSAKELAKIRNTLLEQLKRIQHVPSVPFQ
ERPPDTKFPEEDEEDYEKRPKGHKWGGEYFGSESDEEQK
PQNRDIDISDKPGIRRQSPPNVEAAKKIKVEEEDGDIGI
VNENDGAKWPLGEAG
149 The amino acid sequence of SEQ ID MMETGGNSLPSGPDGVKRKVAYFYDPEVGNYYYGQGHPM
416. The conserved histone KPHRIRMTHALLVQYGLHKEMQILKPYPARDRDLCRFHA
deacetylase family domain is DDYVAFLRGITPETIQDQVKALKRFNVGDDCPVFDGLYQ
underlined. YCQTYAGGSVGGAVKLNHKLCDIAINWAGGLHHAKKCEA
SGFCYVNDIVLAILELLKYHKRVLYVDIDIHHGDGVEEA
FYTTDRVMTVSFHKFGDYFPGTGDIRDIGCGKGKYYAVN
VPLDDGIDDESFQSLFKPIIQQVMLVYNPEAIVLQCGAD
SLSGDRLGCFNLSVKGHAECVRYMRSFNVPLLMVGGGGY
TVRNVARCWCYETGVAVGVEIDDKMPQHEYYEYFGPDYT
VHVAPSNMENKNTKQYLDKIRSKILENINSLPCAPSAQF
QVQPPDTDFPELEEEDYDERTRSHKWDGASCDSDSENGD
LKHRNHDVEESAFPRHNLANISYNTKIKLEGVGTGGLDM
AAGTDTKKNDESFEAMDYESGEELRQDHFASTINASQPC
DPALLTGVQNQLQSTDTVKPIEQSGNAPGIPPPSVATVS
TGTRPSSISRTSSLNSMSSVKQGSILGPNPPQGLNASGL
QFPVPTSNSPIRQGGSYSITVQAPDKQGLQNHMKGPQNM
PGNS
150 The amino acid sequence of SEQ ID MPPKDRVAYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVL
417. The conserved histone SYELHKKMEIYRPHKAYPVELAQFHSADYVEFLHRITPD
deacetylase family domain is TQHLFTKELVKYNMGEDCPVFENLFEFCQIYAGGTIDAA
underlined. HRLNNQICDIAINWSGGLHHAKKCEASGFCYINDLVLGI
LELLKHHARVLYVDIDVHHGDGVEEAFYFTDRVMTVSFH
KYGDMFFPGTGDVKEVGEREGKYYAINVPLKDGIDDASF
TRLFKTIITKVVDIYQPGAIVLQCGADSLAGDRLGCFNL
SIDGHAQCVRIVKKFNLPLLVTGGGGYTKENVARCWSVE
TGVLLDTELPNEIPDNDYIKYFAPDYSLKINTAGNMENL
NSKTYLSAIKVQVMENLRAIQHAPSVQMHEVPPDFYIPD
IDEDELNPDERMDQHTQDRQIQRDDEYYDGDNDIDHDME
EAS
151 The amino acid sequence of SEQ ID MDSSKSEEANILHVFWHEGMLNHDLGTGVFDTLEDPGFL
418. The conserved histone EVLEKHPENADRVRNMLSILRKGPIAPYTEWHTGRAAYL
deacetylase family domain is SELYSFHRPDYVDMLAKTSTAGGKTLCHGTRLNPGSWEA
underlined. ALLAAGTTLEAMRYILDGHGKLSYALVRPPGHHAQPTQA
DGYCFLNNAGLAVELAVASGCKRVAVVDIDVHYGNGTAE
GFYERDDVLTISLHMNHGSWGPSHPQTGFHDEVGRGKGL
GFNLNVPLPNGTGDKGYEHAMHELVVPAISKFMPEMIVL
VIGQDSSAFDPNGRECLTMEGYRKIGQIMRQQADQFSGG
RLVVVQEGGYHITYAAYCLHATLEGVLCLPHPLLSDPIA
YYPEHDIYSERVTFIKNYWQGGIISTTDKRN
152 The amino acid sequence of SEQ ID MEESGNALVSGPDGSKRRVTYFYDADIGNYYYGQGHPMK
419. The conserved histone PHRMRMAHNLIVHYGLHQRMEVCRPHLAQSKDIRAFHTD
deacetylase family domain is DYIHFLSSVAPDTQQEQLRQLKRFNVGEDCPVFDGLFNF
underlined. CQSSAGGSIGAALKLNRKDADIAINWAGGLHHAKKCEAS
GFCYVNDIVLGILELLKVHQRVLYIDIDIHHGDGVEEAF
YTTDRVMTVSFHKFGDYFPGTGHIKDVGYGKGKYYALNV
PLNDGIDDESYKHLFRPIIQKVMEVYQPEAVVLQCGADS
LSGDRLGCFNLSVKGHADCVRFVRSFNIPLMLVGGGGYT
IRNVARCWCYETAVAVGVEPQDKLPYNEYYEYFGPDYTL
YVAPSNMENLNTEKDLEKMRNVLLEQLSKIQHTPSVPFQ
ERPPDTEFNDEEEEDMEKRSKCRIWDGEYVGSEPEEDGK
LPRFDADTYERSVLKHENKRLVPVSNVEPLKRIKQEEDG
AAV
153 The amino acid sequence of SEQ ID MDLNLVSHGEEEEGVRRRKVGIVYDERMCKHATPEDQPH
421. The conserved histone PEQPDRIRVIWDKLNSAGVLHKCVMVEAKEASEEQLAGV
deacetylase family domain is HSRKHIEVMKSIGTARYNKKKRDKLAASYSSIYFSQGSS
underlined. EAALLAAGSVVEISEKVASGELDAGVAIVRPPGHHAEAD
KAMGFCLFNNIAIAAKHLVHERPELGVQKVLIVDWDVHH
GNGTQHMFWTDPHVLYFSVHRFDAGTFYPGGDDGFYDKI
GEGKGAGYNINVPWEQGKCGDADYLAVWDHVLVPVAKSY
DPDMVLISGGFDAALGDPLGGCRLTPYGYSLMTKKLMEF
AGGKIVLALEGGYNLKSLADSFLACVEALLKDGPGRSSV
LTHPFGSTWRVIQAVRKELSSFWPALNEELQLPRLLKDA
SESFDKLSSSSSDESSASEDEKKFAEVTSIMEVSPDPSS
ILALTAEDIAQPLAGLKIEEAGTDSQRSSDHTLLDLTND
DTQKLKQFEGEIFVMIGDEESVPSASSSKDQNESTVVLS
KSNIKAHSWRLTFSSIYVWYASYGSNMWNPRFLCYIEGG
QVEGMAKRCCGSEDKLLLKGYSGKLFLIECFLGDHTQIH
GVQEECPFLIQIVVIRVKRMSACIK
154 The amino acid sequence of SEQ ID MADEDLDLSDVGEVEDEPGEEIESTPPLAVGQEKEINSL
422. The conserved FKBP-type ALKKKLLKVGTRWETPENGDEVTVHYTGTLPDGTKFDSS
peptidyl-prolyl cis-trans RDRGEPFTFKLGQGQVIKGWDQGIVTMKKGERALFTIPP
isomerase signature is underlined ELAYGSSGVRPTIPPNATLQFDVELLSWTNIVDVCNDGG
and the FKBP-type peptidyl-prolyl ILKRIISEGEKYERPKDPDEVTVKYEAKLEDGTLVAKSP
cis-trans isomerase signatures 1 EEGVEFYVNDGHFCPAIAKAVKTMKRGESVILTIKPTYA
and 2 are in bold. FGERGKDAEEGFAAIPPNATLTTSLELVSFKAVIAVTED
KKVIKKILKEADGYDKPSDGTVVQIRYTAKLQDGTIFEK
KGYEGEEPFQFVVDEEQVIAGLDKAVETMKTGEIALITI
GAEYGFGNFETQRDLAVIPPNSTLIYEVEMISFTKEKES
WDMDTTEKIEASKQKKEQGNSLFKVGKYQRAAKKYEKAA
KYIEHDSSFSAEEKKQSKVLKVSCNLNHAACRLKLKDFK
EAVKLCSKVLELESQNVKALYRRAQAYIETADLDLAEFD
IKKALEIEPQNREVQLEYKILKQKQIEYNKKDAKLYGNM
FAKLNKLEAFEGKVLS
155 The amino acid sequence of SEQ ID MADEGLELSDVAEVEDEPGEEFESAPPLVVGQEKELNSS
423. The conserved FKBP-type GLKKKLLKAGTRCETPENGDEVTVHYTGTLLDGTKFDSS
peptidyl-prolyl cis-trans RDRGEPFTFNIGQGQVIKGWDQGIVTMKKREHALFTIPP
isomerase family domains are ELAYGASGMPPTIPPNATLQFDVELLSWTNIVDVCKDGG
underlined. The FKBP-type ILKRIISDGEKYERPKDPDEVTVKYEAKLEDGMLVAKSP
peptidyl-prolyl cis-trans EEGVEFYVNDGNFCPAIVKAVKTMKKGENVTLTIKPAYA
isomerase signatures 1 and 2 are FGEQGKDAEEGFAAIPPNATITINLQLVSFKAVKEVTED
in bold. The TPR repeat is in KKVIKKILKEADGYDKPSDGTVVQIRYTAKLQDGTIFEK
bold/italics. KGYAGEEPFQFVVDEEQVIAGLDKAVETMKTGEVALITI
GPEYGFGNIETQRDLAVIPPYSTLIYEVEMVSFTKEKES
WDMNTTENIEASKQKKEQGNSLFKVGKYLRAAKKYDKAA
KYIEHDNSFSAEEKKQSKVLKVSCNLNHAACCLKLKDFK
KAVKLCSKVLELESQN
REVRLEYLILKQKQIEYNKKDAKLYGNM
FARQNKLEAIEGKD
156 The amino acid sequence of SEQ ID MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
424. The conserved cyclophilin- CTGEKGTGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
type peptidyl-prolyl cis-trans GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
isomerase signature is underlined QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
and the cyclophilin-type peptidyl- GRTSKPVVIADSGQLA
prolyl cis-trans isomerase
signature 2 is in bold.
157 The amino acid sequence of SEQ ID MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
425. The conserved cyclophilin- CTGEKGNGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
type peptidyl-prolyl cis-trans GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
isomerase signature is underlined QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
and the cyclophilin-type peptidyl- GRTSKPVVIADSGQLA
prolyl cis-trans isomerase
signature 2 is in bold.
158 The amino acid sequence of SEQ ID MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
426. The conserved cyclophilin- CTGEKGTGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
type peptidyl-prolyl cis-trans GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
isomerase signature is underlined QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
and the cyclophilin-type peptidyl- GRTSKPVVIADSGQLA
prolyl cis-trans isomerase
signature 2 is in bold.
159 The amino acid sequence of SEQ ID MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRAL
427. The conserved cyclophilin- CTGEKGTGRSGKPLHFKGSSFHRVIPGFMCQGGDFTRGN
type peptidyl-prolyl cis-trans GTGGESIYGEKFADENFVKKHTGPGILSMANAGPNTNGS
isomerase signature is underlined QFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGS
and the cyclophilin-type peptidyl- GRTSKPVVIADSGQLA
prolyl cis-trans isomerase
signature 2 is in bold.
160 The amino acid sequence of SEQ ID MADDFELPESAGMMENEDFGDTVFKVGEEKEIGKQGLKK
428. The conserved FKBP-type LLVKEGGSWETPETGDEVEVHYTGTLLDGTKFDSSRDRG
peptidyl-prolyl cis-trans TPFKFKLGQGQVIKGWDQGIATMKKGENAVFTIPPDLAY
isomerase signature is underlined GESGSQPTIPPNATLKFDVELLSWASVKDICKDGGIFKK
and the FKBP-type peptidyl-prolyl IIKEGEKWEHPKEADEVLVKYEARLEDGTVVSKSEEGVE
cis-trans isomerase signature 1 is FYVKDGYFCPAFAIAVKTMKKGEKVLLTVKPQYGFGHQG
in bold and underlined. The TPR REAIGNDVARSTNATLLVDLELVSWKVVDEVTDDKKVLK
repeat is in bold/italics. KILKQGEGYERPNDGAVVKVKYTGKLEDGTIFEEKGSDE
EPFEFMAGEEQVVDGLDRAVMTMKKGEVALVSVAAEYGY
QTEIKTDLAVVPPKSTLIYEVELVSFVKEKESWDMNTAE
KIEAAGKKKEEGNALFKVGKYFRASKKYEKATKYIEYDT
SFSEEEKKQSKPLK
RDVKLEYRALKEKQKEYNKKEAKFYGNMFARMSKL
EELESRKSGSQKVETANKEEGSDAMAVDGESA
161 The amino acid sequence of SEQ ID MAASLTPLGAGLAYATIYDQAKVRKLEPTKRSLIALCQH
429. The conserved FKBP-type SDSQHRRFITRKYHVNVQILNRRDAIRLIGLAAGLCIDL
peptidylprolyl isomerase domain is SLMYDARGAGLPPQENAKLCDTTCEKELENAPMITTESG
underlined. LQYKDIKIGNGPSPPIGFQVAANYVAMVPSGQVFDSSLD
KGQPYIFRVGSGQVIKGLDEGLLSMKVGGKRRLYIPGPL
AFPKGLNSAPGRPRVAPSSPVIFDVSLEFIPGLESEEE
162 The amino acid sequence of SEQ ID MSAASLSADMAIRGTILGKTALHVLGPQVVSQCRQPVMF
430. The conserved FKBP-type KCPPHTLRKMRFSAQDLQSKNFYSGFTPFKSVFISTSKR
peptidylprolyl isomerase domain is SWQAGSARAMSQDAAFQSKVTTKCFLDIEIGGDPAGRIV
underlined and the cyclophilin- LGLFGEDVPKTAENFRALCTGEKGFGYKGSSFHRIIKDF
type peptidyl-prolyl cis-trans MLQGGDFDRGDGTGGKSIYGRTFEDENFKLAHVGPGVLS
isomerase signature is in bold. MANAGPNTNGSQFFICTVKTPWLDKRHVVFGQVIEGMEI
VKKLESEETNRTDRPKRPCRIVDCGELP
163 The amino acid sequence of SEQ ID MGRIKPQTLLQQSKKKKVPGRISVSTIIVCNLIIIFLMF
431. The conserved FKBP-type SLVGIYRQRAKRNRATSRSDGDEEMENFGRSKINSVPHQ
peptidylprolyl isomerase domain is AIVNTTKGLITLELFGKSSAHTVEKFVEWSERGYFNGLP
underlined. FYRVIKHFVIQVGDPKFAGNREDWTVGGQLNVQLEFSPK
HEAFMLGTSKLEDQGDGFELFITTAPIPDLNDKLNVFGR
VIKGQDVVQEIEEVDTDEHFQPKSPIIINDVRLKDEL
164 The amino acid sequence of SEQ ID MARQSTLLLFWSLVFLGAIVFTQAKHEELEEVTHKVYFD
432. The conserved cyclophilin- VDIAGKPAGRVVIGLFGKAVPKTVENFRALCTGEKGVGK
type peptidyl-prolyl cis-trans SGKPLHYKGSFFHRIIPSFMIQGGDFTLGDGRGGESIYG
isomerase signature is underlined TKFADENFKLKHTGPVFITTVTTDWLDGRHVVFGKIISG
and the cyclophilin-type peptidyl- MDVVYKVEAEGRQSGQPKRKVKIADSGELSMD
prolyl cis-trans isomerase
signature is in bold.
165 The amino acid sequence of SEQ ID MEMDEIQEQSQPQSSEKQDISQESDTGNDKTINAEKITS
434. The conserved FKBP-type ENAEVEEDDMLPPKVNTEVEVLHDKVTKQIIKEGSGNKP
peptidyl-prolyl cis-trans SRNSTCFLHYRAWAESTMHKFQDTWQEQQPLELVLGREK
isomerase signature is underlined KELSGFAIGVAGMKAGERALLHVDWQLGYGEEGNFSFPN
and the TPR repeat is in bold. VPPRANLIYEAELIGFEEAKEGKARSDMTVEERIEAADR
RRQQGNELFKEDKLAEAMQQYEMALAYMGDDFMFQLFGK
YKDMANAVKNPCHLNMAQCLLKLNRYEEAIGQCNMVLAE
DEKNIKALFRRGKARATLGQTDDAREDFQKVRKFSPEDK
AVIRELRLLAEHDKQVYQKQKEMFKGLFGQKPEQKPKKL
HWFVVFWQWLLSMIRTIFRMRSKTD
166 The amino acid sequence of SEQ ID MAGAGEGTPEVTLETSMGPITVELYHKHAPKTCRNFLEL
435. The conserved cyclophilin- SRRGYYNNVKFHRVIKDFMVQGGDPTGTGRGGESIYGPR
type peptidyl-prolyl cis-trans FEDEITRDLKHTGAGILSMANAGPNTNGSQFFISLAPTP
isomerase signature is underlined WLDEKHTIFGRVCKGMDVVKRLGNVQTDKNDRPIHDVKI
and the cyclophilin-type peptidyl- LRTTVKD
prolyl cis-trans isomerase
signature is in bold.
167 The amino acid sequence of SEQ ID MMDPELMRLAQEQMSKISPDELMKMQRQIMANPDLMRMA
436. The conserved TPR repeat SENMKNLKPEDIRFAAEQMKNVRKEEMAEISERISRASP
domain is underlined. EEIEAMKARANLQSAYQLQVAQNLKDQGNQLHARMKYSE
AAEKYLQARNNLTGIPFSEAKSLLLASSSNLMSCYLKTG
QYEECVQTGSEVLAYDAMNVKALYRRGQAYKQIGKLELA
VADLRKAVEVSPEDETIAQALREASTELMEKGGTQDQNG
PRIEEIIEEEAVQPTAEKYPQSAPMVTSVTEDVSDDEQG
SEDQNGFSRDSFQATNAPDGQMYAESLRNLTENPDMLRT
MQSLMKNVDPDSLVALSGGKLSPDMVKTVSGMFGRMSPE
EIQNMMKMSSTLSRQNPSTSSRFDDITRGHSNMDSSPQS
VSVDNDLFEENQNRVGESSTNLSSSAAFSGMPNFSAEMQ
EQVRNQMNDPATRQMFTSMIQNMSPEMMASMSEQFGVKL
SPEDAVKAQNAMASLSPNDLDRLMNWATRLQTAIDYARK
IKNWILGRPGLIFAISMLLLAIILHRFGYIGD
168 The amino acid sequence of SEQ ID MGVEKEILRPGNGPKPRPGQSVTVHCTGYGKNEDLSQKF
437. The conserved FKBP-type WSTKDPGQKPFTFTIGQGRVIKGWDEGVLDMQLGEIFKL
peptidylprolyl isomerase domain is RCSPDYGYGSNGFPAWGIRPNSVLVFEIEVLSVN
underlined and the cyclophilin-
type peptidyl-prolyl cis-trans
isomerase signature is in bold.
169 The amino acid sequence of SEQ ID MPNPRCYLDITIGEELEGRILVELYSDVVPKTAENFRAL
438. The conserved cyclophilin- CTGEKGIGPHTGVPLHYKGLPFHRVIKGFMIQGGDISAQ
type peptidyl-prolyl cis-trans NGTGGESIYGLKFDDENFQLKHERRGMLSMANSGPNTNG
isomerase family domain is SQFFITTTRTSHLDGKHVVFGKVIKGMGVVRGIEHTPTE
underlined and the cyclophilin- SNDRPSLDVVISDCGEIPEGSDDGIANFFKDGDLYPDWP
type peptidyl-prolyl cis-trans ADLDEKSAEISWWMNAVDSAKCFGNENYKKGDYKMALRK
isomerase signature is in bold. YRKALRYLDICWEKEEIDEEKSNHLRKTKSQIFTNSSAC
KLKLGDLKGALLDTEFAMRDGEDNVKALFRQGQAYMALK
DVDSAVASFKKALQLEPNDAGIRKELAVATKMINDRRDQ
ERRAYARMFQ
170 The amino acid sequence of SEQ ID MGDVIDLNGDGGVLKTIIRSAKPGAMQPTEDLPNVDVHY
439. The conserved FKBP-type EGTLADTGEVFDTTREDNTLFSFELGKGTVIKAWDIAVK
peptidylprolyl isomerase domain is TMKVGEVARITCKPEYAYGSAGSPPDIPENATLIFEVEL
underlined and the cyclophilin- VACKPRKGSTFGSVSDEKARLEELKKQREIAAASKEEEK
type peptidyl-prolyl cis-trans KRREEAKATAAARVQAKLEAKKGQGRGKGKSKGK
isomerase signature is in bold.
171 The amino acid sequence of SEQ ID MGLGLKIASASFLPIFNIMATRSLCILLVCFIPVLAHVL
440. The conserved cyclophilin- SLQDPELGTVRVYFQTTYGDIEFGFFPHVAPKTVEHIYK
type peptidyl-prolyl cis-trans LVRLGCYNSNHFFRVDKGFVAQVADVVGGREVPLNSEQR
isomerase signature is underlined. KEGEKTIVGEFSEVKHVRGILSMGRYSDPDSASSSFSIL
LGNAPHLDGQYAVFGKVTKGDDTLKRLEEVPTRQEGIFV
MPLERIRILSTYYYDTNERESNLTCDHEVSILKRRLVES
AYEIEYQRRKCLP
172 The amino acid sequence of SEQ ID MASKRSLRTMNVWPTLPPLVLLLLLCFSSMSSSVVAKKS
441. The conserved FKBP-type DVSELQIGVKHKPKSCDIQAHKGDRIKVHYRGSLTDGTV
peptidylprolyl isomerase domain is FDSSFERGDPIEFELGSGQVIKGWDQGLLGMCVGEKRKL
underlined and the cyclophilin- RIPSKLGYGAQGSPPKIPGGATLIFDTELVAVNGKGISN
type peptidyl-prolyl cis-trans DGDSDL
isomerase signatures are in bold.
173 The amino acid sequence of SEQ ID MSGAPAERPISYFDITIGGKPIGRIVFSLYADLVPKTAE
442. The conserved FKBP-type NFRALCTGEKGIGKSGKPLCYAGSGFHRVIKGFMCQGGD
peptidylprolyl isomerase domain is FTAGNGTGGESIYGEKFEDEAFPVKHTKPFLLSMANAGK
underlined and the cyclophilin- DTNGSQFFITVSQTPHLDDKHVVFGEVIKGKSIVRAIEN
type peptidyl-prolyl cis-trans YPTASGDVPTSPIIISACGVLSPDDPSLAASEETIGDSY
isomerase signatures are in bold. EDYPEDDDSDVQNPEVALDIARKIRELGNKLFKEGQIEL
ALKKYLKSIRYLDVHPVLPDDSPPELKDSYDALLAPLLL
NSALAALRTQPADAQTAVKNATRALERLELSDADKAKAL
YRRASAHVILKQEDEAEEDLVAASQLSPEDMAISSKLKE
VKDEKKKKREKEKKAFKKMFSS
174 The amino acid sequence of SEQ ID MASSLRSSLFSSWALDSKSVCSLFNLNPGKMGLPSISTP
443. The conserved FKBP-type LNWRTCCCSHSSELLELNEGLQSSRRKTVMGLSTVIALS
peptidylprolyl isomerase domain is LVYCDEVGAVSTSKRALRSQKVPEDEYTTLPNGLKYYDL
underlined. KVGSGTEAVKGSRVAVHYVAKWKGITFMTSRQGMGITGG
TPYGFDVGASERGAVLKGLDLGVQGMRVGGQRILIVPPE
LAYGNTGIQEIPPNATLEFDVELISIKQSPFGSSVKIVEG
175 The amino acid sequence of SEQ ID MGAIEDEEPPLKRLKVSSPGLRRGLEEEAPSLSVGSVSI
444. The conserved G-protein beta LMAKSLSLEEGETVGSKGLIRRVEFVRIITQALYSLGYQ
WD-40 repeat domains are KAGALLEEESGILLQSSNVALFRKQILDGKWDESVVTLR
underlined. GIDQVEVEGNTLKAASFLILQQKFFELLDKGNIPEAMKT
LRLEISPMQLNTKRVHELASCIVFPSRCEELGYSKQGNP
KSSQRMKVLQEIQQLLPPSIMIPEKRLERLVEQALNVQR
EACIFHNSLDPALSLYTDHQCGRDQIPTTTLQVLESHKN
EVWFLQFSNNGKYLASASKDCSAIIWEITEGDSFSMKHR
LSAHQKPVSFVAWSPDDKLLLTCGIEEVVKLWNVETGEC
KLTYDKANSGFTSCGWFPDGERFISGGVDKCIYIWDLEG
KELDSWKGQGMPKISDLAVTSDGKEIISICGDNAIVMYN
LDTKTERLIEEESGITSLCVSKDSRFLLLNLANQEIHLW
DIGARSKLLLKYKSHRQSRYVIRSCFSSSDLAFVVSGSE
DSQVYIWHRGNGELLAVLPGHSGTVNCVSWNPVNPHVFA
SASDDYTIRIWGVNRNTFRSKNASSSNGVVHLANGGP
176 The amino acid sequence of SEQ ID MPGTTAGAGIEPTEPQSLKKLSLKSLKRSFDLFASLHGE
445. The conserved G-protein beta PQPPDQRSQRIRIACKVRAEYEVVKNLPTLPQREVGSSV
WD-40 repeat domains are SNSNVGETHSSLTTNQAQGFPTDTSGDLSKDEGKEITSI
underlined and the Trp-Asp (WD) AVHLQPQTGLIDGKAGAIAGTSTAISSVGSSDRYQPSAA
repeats signature is in bold. IMKRLPSKWPRPIWHPPWKNYRVISGHLGWVRSVAFDPG
NEWFCTGSADRTIKIWEVATGKLKLTLTGHIEQIRGLAV
SSRHPYLFSAGDDKQVKCWDLEYNKAIRSYHGHLSGVYC
LALHPTLDILCTGGRDSVCRVWDIRTKAQIFALSGHENT
VCSVFTQAIDPQVVTGSHDTTIKLWDLAAGKTMSTLTYH
KKSVRAIAKHPFEHTFASASADNIKKFKLPKGEFLHNML
SQQKTIVNAMAINEDNVLVSAGDNGSLWFWDWKSGHNFQ
QAQTIVQPGSLDSEAGIYALQYDITGSRLVSCEADKTIK
MWKEDETATPESHPINFKAPKDIRRF
177 The amino acid sequence of SEQ ID MRPILMKGHERPLTFLKYNRDGDLLFSCAKDHTPTVWYG
446. The conserved G-protein beta HNGERLGTYRGHNGAVWCCDVSRDSTRLITSSADQTAKL
WD-40 repeat domains are WNVETGAQLFSFNFESPARAVDLAIGDKLVVITTDPFME
underlined. LPSAIHIKRIEKDLSKQTADSVLTITGIKGRINRAVWGP
LNSTIISGGEDSVVRIWDSETGKLLRESDKETGHQKPIT
SLCKSADGSHFLTGSLDKSARLWDIRTLTLIKTYVTERP
VNAVAISPLLDHVVIGGGQEASHVTTTDREAGKFEAKFF
HKILEEEIGGVKGHFGPINSLAFNPDGRSFASGGEDGYV
RLHHFDPDYFHIKM
178 The amino acid sequence of SEQ ID MRPILMKGHERPLTFLKYNRDGDLLFSCAKDHTPTVWYG
447. The conserved G-protein beta HNGERLGTYRGHNGAVWCCDVSRDSTRLITSSADQTAKL
WD-40 repeat domains are WNVETGNQLFSFNFESPARAVDLAIGDKLVVITTDPFME
underlined. LPSAIHIKRIEKDLSKQTADSVLTITGIKGRINRAVWGP
LNSTIISGGEDSVVRIWDSETGKLLRESDKETGHQKAIT
SLCKSADGSHFLTGSLDKSARLWDIRTLTLIKTYVTERP
VNAVAISPLLDHVVIGGGQEASHVTTTDRRAGKFEAKFF
HKILEEEIGGVKGHFGPINSLAFNPDGRSFASGGEDGYV
RLHHFDPDYFHIKM
179 The amino acid sequence of SEQ ID MAENNVGDFIPLDRQEYPSKPAPGAVDSSFWKSFKKKEV
448. The conserved G-protein beta SRQIAGVTCINFCPEPPHDFAVTSSTRVHIYDGKSCELK
WD-40 repeat domains are KTITKFKDVAYSGVFRSDGQIIAAGGETGVIQVFNAKSQ
underlined. MVLRQLKGHGRPVRVVRYSPQDKLHLLSGGDDSMVKWWD
ITTQEELLNLEGHKDYVRCGAASPSSVNLWATGSYDHTV
RLWDLRNSKTVLQLKHGKPLEDVLFFPSGGLLATAGGNV
VKVWDILGGGRPIHTMETHQKTVMAMCISKVPRSGQALG
DAPSRLVTASLDGYMKVFDLDHFKVTHSARYPAPILSMG
ISSLCRTMAVGTSSGLLFIRQRKGQIEDKIHSDSSGLQV
NPVNDEKDSAVLKPNQYRYYLRGRSEKPSEGDYVVKRMA
KVYFQEYDKDLRHFNHSKALVSALKAADSKGTVAVIEEL
VARKRLIQTLSILNLDELELLINFLSRFILVPKYSRFLI
SLTDRVLDARAVDLGKSENLKKQIADLKGIVVQELRVQQ
SMQELQGIIEPLIRASAR
180 The amino acid sequence of SEQ ID MDVETSSKPTGNKRTYTRLPRQVCVFWQEGRCTRESCNF
449. The conserved C-x8-C-x5-C-x3- LHVDEPGSVKRGGATNGFAPKRSYNGSDERDTLAAGPPG
H type zinc finger is underlined GSRRNISARWGRGRGGIFISDERQKIRNKV NYWLAGN
and in italics and the conserved QRGEE KYL SFVMGSDVKFLTQLSGHVKAIRGIAFPSD
Cys and His residues in bold. The SGKLYSGGQDKKVIVWDCQTGQGTDIPLNDEVGCLMSEG
conserved G-protein beta WD-40 PWIFVGLPNAVKAWNILTSTELSLVGPRGQVHALAVGNG
repeat domains are underlined and MLFAGTHDGSILAWKFSPASNTFEPAASLVGHTQAVVSL
the Trp-Asp (WD) repeats signature VSGADRLYSGSMDKTIRVWDLGTFQCLQTLRDHTSVVMS
is in bold (non-italics). LLCWDQFLLSCSLDNTVKVWVATSSGALEVTYTHNEEHG
VLALCGMNDEQAKPVLLCSCNDNTVRLYDLPSFSERGRI
FSRNEVRTFQIAPGGLFFTGDATGELKVWNWATQKS
181 The amino acid sequence of SEQ ID MSVQELRERHAAATAKVNALRERIKAKRLQLLDTDVATY
450. The conserved G-protein beta ASSNGRTPISFSFTDLVCCRTLQGHTGKVYSLDWTSEKN
WD-40 repeat domains are RIVSASQDGRLIVWNALTSQKTHAIKLPCAWVMTCAFSP
underlined. SGQAVACGGLDSVCSIFQLNNQLDRDGHLPVSRILSGHR
SYVSSCQYVPDGDTHVITGSGDRTCIQWDVTTGQRIAIF
GGEFPLGHTADVMSVSISAANPKEFVSGSCDTTTRLWDT
RIASRAIRTFHGHEADVNTVKFFPDGLRFGSGSDDGTCR
LFDIRTGHQLQVYRQPPRENQSPTVTAIAFSFSGRLLFA
GYSNGDCFVWDTILEKVVLNLGELQNTHNGRISCLGLSA
DGSALCTGSWDKNLKIWAFGGHRKIV
182 The amino acid sequence of SEQ ID MKVKIISRSTDEFTRERSNDLQRVFRNFDPNLHTQARAQ
451. The conserved G-protein beta EYVRALNAAKLDKIFAKPFLAAMSGHIDGISAMAKSPRH
WD-40 repeat domains are LKSIFSGSVDGDIRLWDIAARRTVQQFPGHRGAVRGLTV
underlined. STEGGRLISCGDDCTVRLWDIPVAGIGESSYGSENVQKP
LATYVGKNSFRAVDYQWDSNVFATGGAQVDIWDHDRSEP
TNSFAWGSDTVISVRFNPAEKDIFATTASDRSIVLYDLR
MASPLNKLIMQTRNNAIAWNPREPMNFTAANEDCNCYSY
DMRRMNISTCVHQDHVSAVMDIDYSPSGREFVTGSYDRT
VRIFPYNAGHSREIYHTKRMQRVFCVKFSGDATYVVSGS
DDANIRLWKAKASEQLGVLLPRERKRHEYLDAVKERFKH
LPEIKRIERHRHLPKPIYKAALLRHTVNAAAKRKEERKR
AHSAPGSVVTNPLRKKRIVAQLE
183 The amino acid sequence of SEQ ID MDHYYQDDFDYLVDDEMVDFADDVEDDVRTRRRSDIDSD
452. The conserved G-protein beta SENDFDLNNKSPDTTALQAKRGKDIQGIPWNRLNFTREK
WD-40 repeat domains are YRETRLQQYKNYENLPRPRRSRNLDKECTNFERGSSFYD
underlined. FRHNTRSVKATIVHFQLRNLVWATSKHNVYLMQNYSIMH
WSSLKQKGEEVLNVAGPIVPSVKHPGSSPQGLTRVQVSA
MSVKDNLVVAGGFQGELICKYLDKPGVSFCTKISHDENG
ITNAVEIYNDASGATRLMTANNDLAVRVFDTEKFTVLER
FSFPWSVNHTSVSPDGKLVAVLGDNADCLLADCKTGKTV
GTLRGHLDYSFAAAWHPDGYILATGNQDTTCRLWDVRKL
SSSLAVLKGRMGAIRSIRFSSDGRFMAMAEPADFVHLYD
TRQNYTKSQEIDLFGEIAGISFSPDTEAFFVGVADRTYG
SLLEFNRRRMNYYLDSIL
184 The amino acid sequence of SEQ ID MAEALVLRGTMEGHTDAVTAIATPIDNSDMIVSSSRDKS
453. The conserved G-protein beta ILLWNLTKEPEKYGVPRRRLTGHSHFVQDVVISSDGQFA
WD-40 repeat domains are LSGSWDSELRLWDLNTGLTTRRFVGHTKDVLSVAFSIDN
underlined and the Trp-Asp (WD) RQIVSASRDRTIKLWNTLGECKYTIQPDAEGHSNWISCV
repeats signatures are in bold. RFSPSATNPTIVSCSWDRTVKVWNLTNCKLRNTLVGHGG
YVNTAAVSPDGSLCASGGKDGVTMLWDLAEGKRLYSLDA
GDIIYALCFSPNRYWLCAATQQCVKIWDLESKSIVADLR
PDFIPNKKAQIPYCTSLSWSADGSTLFSGYTDGKIRVWG
IGHV
185 The amino acid sequence of SEQ ID MAAIKSTSRSASVAFAPDAPLLAAGTMAGAIDLSFSSLA
454. The conserved G-protein beta NLEIFKLDFQSDDPELPVVGECPSNERLNRLSWGSAGGS
WD-40 repeat domains are FGIIAGGLVDGTINIWNPATLINSEDNGDALIARLEQHT
underlined. GPVRGLEFNTISTNLLASGAEDGELCIWDLANPTAPTHF
PPLKGVGSGAQGEISFLAWNRKVQHILASTSYSGTTVVW
DLRRQKPIISFPDATRRRCSVLQWNPDASTQLIVASDDD
NSPTLEAWDLRNTISPYKEFVGHSRGVIAMSWCPSDSLF
LLTCAKDNRTLCWDTGSGEIVCELPAGANWNFDVQWSPK
IPGILSTSSFDGKIGIHNIEACSRNVSGEVEFGGAIVRG
GPSALLKAPKWLERPAGVSFGFGGKLASFRPSTVAQAAD
HRHSEVFIHNLVTEDNLVIRSTEFEAAIADGEKVSLRAL
CDRKAEESQSDEEKETWNFLRVMFEDEGTARTKLLEHLG
FKVQSEENGDLQETHSSKIDDIGSEIGKTLTLDDKTEED
VLPQLKGGQDAAIPQDNGEDFFDNLHSPKEEVSLSHVGN
DFVGEKDKDMVVNGAEIEHETEDLTEYSDWNEAIQHSLV
VGDYKGAVLQCLSANRMADALIIAHLGGNSLWEKTRDEY
LKKAKSSYLKVVSAMVNNDLTGLVNSRPLKSWKETLAML
CTYSQREEWTVLCDMLASRLIAAGNVMAATLCYICAGNI
EKTVEIWSRSLKYDYDGRSFVDHLQDVMEKTVVLALATG
QKRVSPSLSKLVENYAELLASQGLLTTAMEYLKLLGTEE
SSHELSILRDRLYLSGTDNKVEASSFPFETRQDLTESQY
NMHQTGFGAPETQKNYQENVHQVLPSGSYTDNYQPTANT
HYIAGYQPAPQQQPSFQNYFTPASYQPAPSPNVFYPSQV
SQAEQSNFAPPVNQPPMKTFVPSTPPILRNVDQYQTPSL
NPQLYQGVSSATVETHPYQTGAPASVSVGTTPGQPSVVP
NFMVPGPVTAPTVTPRGFMPVTTPTQHPLGSANPPVQPQ
SPQSSQVQSV
186 The amino acid sequence of SEQ ID MAGAADSQLQTLSERDSTPNFKNLHTREYAAHKKKVHSV
455. The conserved G-protein beta AWNCTGTKLASGSVDQTARVWNIEPHGHSKTKDLELKGH
WD-40 repeat domains are ADSVDQLCWDPKHSELLATASGDRTVRLWDARSGKCSQQ
underlined and the Trp-Asp (WD) VELSGENINITFKPDGTHIAVGNRDDELTIIDVRKFKPL
repeats signature is in bold. HKRKFSYEVNEIAWNTTGELFFLTTGNGTVEVLSYPSLQ
VLHTLVAHTAGCYCIAIDPIGRYFAVGSADALVSLWDLS
EMLCVRTFTKLEWPVRTISFNHDGQYIASASEDLFIDIA
DVQTGRTVHQISCRAAMNSVEWNPKYNLLAFAGDDKNKY
MQDEGVFRVFGFETP
187 The amino acid sequence of SEQ ID MAATSPVGAGSGRELANPPTDGISNLRFSNHSDHLLVSS
456. The conserved G-protein beta WDRKVRLYDASANSLKGQFVHGGPVLDCCFHDDASGFSG
WD-40 repeat domains are SADNTVRRYDFSTRKEDILGRHEAPVRCVEYSYAAGQVI
underlined. TGSWDKTLKCWDPRGASGQEKTLVGTYSQLERVYSMSLV
GHRLVVATAGRHINVYDLRNMSQPEQRRESSLKYQTRCV
RCYPNGTGFALSSVEGRVAMEFFDLSEAGQAKKYAFKCH
RKSEAGRDTVYPVNAIAFHPIYGTFATGGCDGYVNVWDG
NNKKRLYQYSKYPTSIAALSFSRDGRLLAVASSYTFEEG
EKPHEPDAVFVRSVNEAEVKPKPKVYAAPP
188 The amino acid sequence of SEQ ID MASDDEEGFKNEEAPGVVDEAEVQEGLRACFPLSFGKQE
457. The conserved G-protein beta KKQAPLESIHSATKRPEDPRPRRQLGPPRPPPSILAEQE
WD-40 repeat domains are DSDRFVGPPRPPQFVRDDNDDGEAEIMIGPPRPPAQYSD
underlined and the Trp-Asp (WD) DHDNEETIGPPKPSYLEKGEETDQMVGPSKRGSDDETSG
repeats signature is in bold. DSDDGDDAVDFRVPLSNEIVLRGHTKVVSALAIDQTGSR
VLTGSYDYSVRMYDFQGMTSQLKSFRQLEPAEGHQVRSL
SWSPTSDRFLCVTGSAQAKIFDRDGLTLGEFVKGDMYLR
DLKNTKGHISGLTCGEWHPKEKQTILTCSEDGSLRIWDV
NDFNTQKQVIKPKLAKPGRVPVTACAWGRDGKCIAGGVG
DGSIQVWNLKPGWGSRPDLYVAKGHDDDITGLQFSADGN
ILLTRSTDETLKVWDLRKAITPLQVFRDLPNNYAQTNVA
FSPDERLIFTGTSVERDGNSGGLLCFYDRQTLELVLRIG
VSPVHSVVRCTWHPRHNQVFATVGDKKEGGAHILYDPAL
SERGALVCVARAPRKKSLDDFEAKPVIHNPHALPLFRDE
PSRKRQREKARMDPMKSQRPDLPVTGPGFGGRVGSTKGS
LLTQYLLKEGGLIKETWMEEDPREAILKYADVAAKDPKF
IAPAYAQTQPETVFAETDSEEEQK
189 The amino acid sequence of SEQ ID MKERGQSHAGQPSVDERYTQWKSLVPVLYDWLANHNLVW
458. The conserved G-protein beta PSLSCRWGPQMHQATYKNSQRLYLSEQTDGTVPNTLVIA
WD-40 repeat domains are TCEVVKPRVAAAEHISQFNEEARSPFVKKFKTIIHPGEV
underlined. NRIRELPQNSKIVATHTDGPDVLIWDVDTQPNRQATLGA
ADSRPDLVLTGHKDNAEFALAMSPSAPFVLSGGKDKCVL
LWSIQDHISAATEPSSAKASKTPSSAHGEKVPKIPSIGP
RGVYKGHKDTVEDVQFCPSNAQEFCSVGDDSALILWDAR
NGNEPVIKVEKAHNADLHCVDWNPHDENLILTGSADNSV
RMFDRRNLTSSGVGSPVHKFEGHSAPVLCVQWCPDKASV
FGSAAEDSYLNVWDYEKVGKNVGKKTPPGLFFQHAGHRD
KVVDFHWNSFDPWTIVSVSDDGESTGGGGTLQIWRMSDL
IYRPEDEVLAELERFEAHILSCQNK
190 The amino acid sequence of SEQ ID MSSLSRELVFLILQFLDEEKFKESVHKLEQESG NMK
459. The conserved G-protein beta YFDEKAQAGEWDEVERYLSGFTKVDDNRYSMKIFFEIRK
WD-40 repeat domains are QKYLEALDRQDRAKAVDILVKDLKVFSTFNEELYKEITQ
underlined. The Lissencephaly LLTLDNFRENEQLSKYGDTKSARTIMMSELKKLIEANPL
type-1-like homology motif is in FREKLIYPNLKASRLRTLINQSLNWQHQLCKNPRPNPDI
bold and the CTLH, C-terminal to KTLFTDHACGPPNGARTPTQPTASLGVLPKATTFTPIGP
LisH motif is in italics. HGPFPSSSTATSGLASWMSNPNMVTSPQAPVAVGPSVPV
PPNQATLLKRPRTPPGSSSVVDYQTADSEQLIKRLRPVS
QSIDEATYPGPTLRVPWSTDDLPKTLARALNEPYPVTSI
DFHPSQQTFLLVGTKNGEITLWEVGSREKLATRSFKIWD
NANCSNHLEAAFVKDSSVSINRVLWSPDGTLIGIAFTKH
LVHTYTFQGLDLRQHLEIDAHVGGVNDLAFSHPNKQLCV
VTCGDDKMIKVWDAVTGRKLYNFEGHDAPVYSVCPHHKE
NIQFIFSTAVDGKIKAWLYDHLGSRVDYDAPGHSCTTMM
YSADGTRLFSCGTSKEGESFLVEWNESEGAIKRTYSGLR
KKGSGVVQFDTTQNHFLAVGDEHLIKFWDMDSTNMLTSC
DAEGGLLNLPRLRFNKEGSLLAVTTVNGIKILANADGQK
LLKTMENRTFDLPSRAHIDAASATSSPATGRMERIERTS
SANTVSGINGVDPAQSSEKLRLSDDLSEKTKIWKLTEIT
DSIQCRCITLPENAAEPASKVSRLLYTNSGVGLLALGSN
AVHKLWKWNRSEQNPSGKATASVHPQRWQPTSGLLMTND
ITDINPEEAVPCIALSKNDSYVMSASGGKVSLFNMMTFK
VMTTFMPPPPASTFLAFHPQDNNIIAIGMEDSTIHIYNV
RVDEVKTKLKGHQKRITGLAFSSTQNILVSSGADAQLCV
WNTETWEKRKSKTIQMPVGKTVSGDTRVQFHSDQLHILV
VHETQLAIYDAYKLERQYQWVPQDALSAPILYATYSCNR
QLIYATFSDG
191 The amino acid sequence of SEQ ID MAKDEEEFRGEMEERLVNEEYKIWKKNTPFLYDLVITHA
460. The conserved G-protein beta LEWPSLTVQWLPDREEPPGKDYSVQKMILGTHTSDNEPN
WD-40 repeat domains are YLMLAQVQLPLEDAENDARQYDDERGEIGGFGCANGKVQ
underlined. VIQQINHDGEVNRARYMPQNPFIIATKTVSAEVYVFDYS
KHPSKPPQDGGCHPDLRLRGHNTEGYGLSWSPFKHGHLL
SGSDDAQICLWDINVPAKNKVLEAQQIFKVHEGVVEDVA
WHLRHEYLFGSVGDDRHLLIWDLRTSATNKPLHSVVAHQ
GEVNCLAFNPFNEWVLATGSADRTVKLFDLRKISSALHT
FSCHKEEVFQIGWSPKNETILASCSADRRLMVWDLSRID
EFQTPEDALDGPPELLFIHGGHTSKISDFSWNPCEDWVI
ASVAEDNILQIWQMAENIYHDEEDDMPPEEVV
192 The amino acid sequence of SEQ ID MSPGVKQTGSQKFESGHQDVVHDVTMDYYGKRIATCSAD
461. The conserved G-protein beta RTIKLFGLNASDTPSLLASLTGHEGPVWQVAWAHPKFGS
WD-40 repeat domains are MLASCSYDGRVIIWREGQQENEWSQVQVFKEHEASVNSI
underlined. SWAPHELGLCLACGSSDGSITVFTCREDGSWDKTKIDQA
HQVGVTAVSWAPASAPGSLVGQPSDPIQKLVSGGCDNTA
KVWKFYNGSWKLDCFPPLQMHTDWVRDVAWAPNLGLPKS
TIASCSQDGKVVIWTQGKEGDKWEGRILNDFKIPVWRVN
WSLTGNILAVADGNNSVTLWKEAVDGDWNQVTTVQ
193 The amino acid sequence of SEQ ID MSSGVKQTGSQKFESGHQDVVHDVTMDYYGKRIATCSAD
462. The conserved G-protein beta RTIKLFGMNTSDTPTLLASLTGHEGPVWQVAWAHPKFGS
WD-40 repeat domains are MLASCSYDRRVIIWREGQQENEWSQVQVFKEHEASVNSI
underlined. SWAPHELGLCLACGSSDGSITVFTGREDGSWDKTKIDQA
HQVGVTAVSWAPASAPGSLVGQPSDPVQKLVSGGCDNTA
KVWKFYNGSWKLDCFPPLQMHTDWVRDVAWAPNLGLPKS
TIASCSQDGRVVIWTQGKEGDKWEGKILNDFKTPVWRIS
WSLTGNILAVADGNNNVTLWKEAVDGEWNQVTTVQ
194 The amino acid sequence of SEQ ID MKKRSRPSNGHLSTAAKNKSRKTAPITKDPFFDSAHNRN
463. The conserved G-protein beta KSKGKGKSRGKGEEIFSSDEDDDAIGRDAPAEEEEEIAE
WD-40 repeat domains are EERETADEKRLRVAKAYLDKIRAITKANEEDNEEEAGED
underlined. EETEAERRGKRDSLVAEILQQEQLEESGRVQRQLASRVV
TPSKLVECRVVKRHKQSVTAVALTEDDLRGFSASKDGTI
IHWDVETGASEKYEWPSQAVSVSSSNEVSKTQKSKGSKK
QGSKHVLSMAVSSDGRYLATGGLDRYIHLWDTRTQKHIQ
AFRGHRGAVSCLAFRQGTQQLISGSFDRTIKLWSAEDRA
YMDTLYGHQSEILAVDCLRKERVLSVGRDHTLRLWKVPE
ETQLVFRGHAASLECCCFINNEDFLSGSDDGSIELWSML
RKKPVFMAKNAHGHAIVENLSEDTSTREEPDEEVTTRQL
PNGNSIGNGMTNQMGITPSVESWVGAVTVCRGTDLAASG
AGNGVVRLWAIENSSKSLRALHDIPLTGFVNSLTFARSG
RFLIAGVGQEPRLGRWGRIQAARNGVTLCPIELS
195 The amino acid sequence of SEQ ID MAATFGTINTATSPHNPNKSFEIVQPPNDSISSLSFSPK
464. The conserved G-protein beta ANYLVATSWDNQVRCWEVLQTGASMPKAAMSHDQPVLCS
WD-40 repeat domains are TWKDDGTAVFSAGCDKQAKMWPLLTGGQPVTVAMHDAPI
underlined. KDIAWIPEMNLLATGSWDKTLKYWDTRQSNPVHTQQLPE
RCFALSVRHPLMVVGTADRNLIIFNLQNPQTEFKRISSP
LKYQTRCVAAFPDKQGFLVGSIEGRVGVHHVEEAQQSKN
FTFKCHRDSNDIYAVNSLNFHPVHQTFATAGSDGAFNFW
DKDSKQRLKAMARSNQPIPCSTFNSDGSLYAYAVSYDWS
KGAENHNPATAKHHILLHVPQESEIKGKPRVTTSGRK
196 The amino acid sequence of SEQ ID MVVMDKGTHQTNEDESESEFIDEDDVIDEISIDEEDLPD
465. The conserved G-protein beta ADVEGEDVQEDNKRSEPDENSSSLDDAIHTFEGHEDTLF
WD-40 repeat domains are AVACSPVDATWVASGGGDDKAFMWRIGHATPFFELKGHT
underlined. DSVVALSFSNDGLLLASGGLDGVVRIWDASTGNLIHVLD
GPGGGIEWVRWHPKGHLVLAGSEDYSTWMWNADLGKCLS
VYTGHCESVTCGDFTPDGKAICTGSADGSLRVWNPQTQE
SKLTVKGYPYHTEGLTCLSISSDSTLVVSGSTDGSVHVV
NIKNGKVVASLVGHSGSIECVRFSPSLTWVATGGMDKKL
MIWELQSSSLRCTCQHEEGVMRLSWSLSSQHIITSSLDG
IVRLWDSRSGVCERVFEGHNDSIQDMVVTVDQRFILTGS
DDTTAKVFEIGAF
197 The amino acid sequence of SEQ ID MPVFRTAFNGYAVKFSPFVETRLAVATAQNFGIIGNGRQ
466. The conserved G-protein beta HVLELTPNGIVEVCAFDSSDGLYDCTWSEANENLVVSAS
WD-40 repeat domains are GDGSVKIWDIALPPVANPIRSLEEHAREVYSVDWNLVRK
underlined and the Trp-Asp (WD) DCFLSASWDDTIRLWTIDRPQSMRLFKEHTYCIYAAVWN
repeats signature is in bold. PRHADVFASASGDCTVRIWDVREPNATIIIPAHEHEILS
CDWNKYNDCMLVTGSVDKLIKVWDIRTYRTPMTVLEGHT
YAIRRVKFSPHQESLIASCSYDMTTCMWDYRAPEDALLA
RYDHHTEFAVGIDISVLVEGLLASTGWDETVYVWQHGMD
PRAC
198 The amino acid sequence of SEQ ID MDSRNRRSRLNLPPGMSPSSLHLETTAGSPGLSRVNSSP
467. The conserved G-protein beta STPSPSRTTTYSDRFIPSRTGSRLNGFALIDKQPQPLPS
WD-40 repeat domains are PTRSAAEGRDDASSSSASAYSTLLRNELFGEDVVGPATP
underlined. ATPEKSTGLYGGSRDSIKSPMSPSRNLFRFKNDHGGNSP
GSPYSASTVGSEGLFSSNVGTPPKPARKITRSPYKVLDA
PALQDDFYLNLVDWSSNNVLAVGLGTCVYLWSACTSKVT
KLCDLGVNDSVCSVGWTPQGTHLAVGTNIGEVQIWDTSR
CKKVRTMGGHCTRAGALAWSSYILSSGSRDRNILHRDIR
VQDDFIRKLVGHKSEVCGLKWSYDDRELASGGNDNQLLV
WNQQSAQPLLRFNEHTAAVKAIAWSPHQHGILASGGGTA
DRCLRFWNTATDTRLNCVDTGSQVCNLVWCKNVNELVST
HGYSQNQIMVWRYPSMSKLATLTGHTLRVLYLAISPDGQ
TIVTGAGDETLRFWSIFPSPKSQSAVHDSGLWSLGRTHIR
199 The amino acid sequence of SEQ ID MEKKKVVVPIVCHGHSRPIVDLFYSPVTPDGLFLISASK
468. The conserved G-protein beta DSSTMLRNGETGDWIGTFEGHKGAVWSCCLDNRALRAAS
WD-40 repeat domains are GSADFSAKIWDALTGDELHCFVHKHIVRACAFSESTSLL
underlined. LTGGHEKILRIFDLNRPDAPPKEVDNSPGSIRTVAWLHS
DQTILSSNSDAGGVRLWDLRTEKIVRVLETKSPVTSAEV
SQDGRYITTADGNSVKFWDANHFGMVKSYTMPCMVESAS
LEPTMGNMFVAGGEDMWVRLFDFHTGEEIACNKGHHGPV
HCVRFAPGGESYSSGSEDGTIRIWQTLNMNSEENESYGV
NGLSGKVRVGVDDVVQKVEGFQITADGHLNDKPEKPNP
200 The amino acid sequence of SEQ ID MERYSQGTQKKSEIYTYEAPWQIYGMNWSVRKDKKFRLG
469. The conserved G-protein beta IGSFLEEYNNRVEIIELDEESGEFKSDPRLAFDHPYPTT
WD-40 repeat domains are KIMFVPDKECQRPDLLATTGDYLRIWQVCEDRVEPKSLL
underlined. NNNKNSEFCAPLTSFDWNDADPKRIGTSSIDTTCTIWDI
EKEVVDTQLIAHDKEVYDIAWGEVGVFASVSADGSVRVF
DLRDKEHSTIIYESSQPETPLLRLGWNKQDPRFIATILM
DSCKVVILDIRFPTLPVAELQRHQASVNTIAWAPHSPCH
ICTAGDDSQALIWELSSVSQPLVEGGGLDPILAYTAAAE
INQLQWSSMQPDWVAIAFSNEVQILRV
201 The amino acid sequence of SEQ ID MQSENNLDESLHLREVQELQGHTDTVWAVAWNPVTGIDG
470. The conserved G-protein beta APSMLASCSGDKTVRIWENTHTLNSTSPSWACKAVLEET
WD-40 repeat domains are HTRTVRSCAWSPNGKLLATASFDATTAIWENVGGEFECI
underlined. ASLEGHENEVKSVSWSASGMLLATCGRDKSVWIWDVQPG
NEFECVSVLQGHTQDVKMVQWHPNRDILVSASYDNSIKV
WAEDGDGDDWACMQTLGNSVSGHTSTVWAVSFNSSGDRM
VSCSDDLTLMVWDTSINPAERSGNAGPWKHLCTISGYHD
RTIFSVHWSRSGLIASGASDDCIRLFS
202 The amino acid sequence of SEQ ID MKRAYKLQEFVAHASNVNCLKIGKKSSRVLVTGGEDHKV
471. The conserved G-protein beta NMWAIGKPNAILSLSGHSSAVESVTFDSAEALVVAGAAS
WD-40 repeat domains are GTIKLWDLEEAKIVRTLTGHRSNCISVDFHPFGEFFASG
underlined and the Trp-Asp (WD) SLDTNLKIWDIRRKGCIHTYKGHTRGVNSIRFSPDGRWV
repeats signature is in bold. VSGGEDNIVKLWDLTAGKLMHDFKCHEGQIQCMDFHPQE
FLLATGSADRTVKFWDLETFELIGSAGPETTGVRAMIFN
PDGRTLLTGLHESLKVFSWEPLRCYDAVDVGWSKLADLN
IHEGKLLGCSYNQSCVGVWVVDISRVGPYAAGNVSRTNG
HNEAKLASSGHPSVQQLDNNLKTNMARLSLSHSTESGIK
EPKTTTSLTTTEGLSSTPQRAGIAFSSKNLPASSGPPSY
VSTPKKNSTSRVQPTTNFQTLSRPDIVPVIVPRSNSLRP
ETTSDVKKEMNNFGRVVPSTVSTKSTDVIKSGSNRDESD
KIDSINQKRMTGNDKTDLNIARAEQHVSSRLDNTNTSSV
VCDGNQPAARWIGAAKFRRNSPVDPVVSPHDRSPTFPWS
ATDDGVTCQPDRQVTAPELSKRVVEPGRARALVASWETR
EKALTADTPVLVSGRPPTSPGVDMNSFIPRGSHGTSESD
LTVSDDNSAIEELMQQHNAFTSILQARLTKLQVIRRFWQ
RNDLKGAIDATGKMGDHSVSADVISVLIERSEIFTLDIC
TVILPLLTRLLQSETDRHLTVAMETLLVLVKTFGDVIRA
TISATPTIGVDLQAEQRLERCNLCYVELENIKQILVPLI
RRGGAVAKSAQELSLALQEV
203 The amino acid sequence of SEQ ID MSTLEIEARDVIKIVLQFCKENSLHQTFQTLQNECQVSL
472. The conserved G-protein beta NTVDSLETFVADINSGRWDVILPQVAQLKLPRKKLEDLY
WD-40 repeat domains are EQIVLEMIELRELDTARAILRQTQAMGFMKQEQPERYLR
underlined and the Trp-Asp (WD) LEHLLVRTYFDPREAYHESSKEKRRSQIAQALASEVTVV
repeats signature is in bold. PPSRLMALIGQSLKWQQHQGLLPPGTQFDLFRGTAAVKA
DEEEMYPTTLAHTIKFGKQSHPECARFSPDGQYLVSCSV
DGFIEVWDYISGKLKKDLQYQADDSFMMHDDAVLCVDFS
RDSEMLASGSQDGKIKVWRIRTGQCLRRLERAHSQGVTS
LSFSRDGSQLLSTSFDSTARIHGLKSGKALKEFRGHTSY
VNDAIFTSDGGRVITASSDCTVKVWDVKTTDCIQTFKPP
PPLKGGDVSVNSVHLFPKNSEHIVVCNKASSIYIMTLQG
QVVKSFSSGKREGGDFVAACISPKGEWIYCVGEDRNIYC
FSQQSGKLEHLMKAHDKDIIGVTPHPHRNLLVTYSEDST
MKIWKP
204 The amino acid sequence of SEQ ID MDIELEDQPFDLDFHPSAPIVAVALITGRLQLFRYVDIS
473. The conserved G-protein beta SEPERLWTVTAHTESCRAARFINAGSSVLTASPDCSILA
WD-40 repeat domains are TNVETGQPVARLDNAHGAAINCLTNLTESTIASGDENGI
underlined. IKVWDTRQNSCCNKFKAHEDYISDMEFVPDTMQLLGTSG
DGTLSVCNLRKNKVHARSEFSEDELLSVALMKNGKKVVC
GSQEGVLLLYSWGYFKDCSDRFVGHPHSVDALLKLDEDT
VLTGSSDGIIRVVSILPNKMIGVIGEHSSYPIERLAFSH
DRNVLGSASHDQILKLWDIHYLHEDDEPETNKQEAVNDE
NVDMDLDVDTEKRPRGSKRKKRAEKGQTSSQKQSSDFFA
DI
205 The amino acid sequence of SEQ ID MDRIQQIPHTCVARKINLPLGMSKESLALNLPANLAPTM
474. The conserved G-protein beta SPPSITYSDRFIPSRKASNFEEFALPDKTSPSPNSAGGQ
WD-40 repeat domains are SSSTNGEGRDDACAAYSALLRTELFPATPDKTEGCRRPV
underlined. IGSPSGNVFRFKSQQCKSQSPFSLCPVGEDGDLSETGAV
ARKTTRKIPRSPFKVLDAPALQDDFYLNLVDWSSHNILA
VGLSACVYLWSASSSKVTKLCDLGLDDNVCSVAWTQRGT
YLAVGTNNGGVQIWDAAHCKQVRTMEGHCTRVGTLAWNS
HILSSGGRDRNILQRDIRAQDDFVSKFSGHKSEVCGLKW
SYDNRELASGGNDNQLFVWNQQSQQPVLKYNEHTAAVKA
IAWSPHQHGLLASGGGTADRCIRFWNTATNTSLNCVDTG
SQVCNLVWSKNVNELVSTHGYSQNQIIVWRYPTMSKLAT
LTGHTLRVLYLAISPDGQTIVTGAGDETLRFWNVFPSSK
TQQNTIRDMGVWSSGRTHIR
206 The amino acid sequence of SEQ ID MAGGQGEGEEKVDKLSMELTEDVMKSMEIGAVFKDYNGK
475. The conserved G-protein beta INSLDFHRTNNYLVTASDDEAIRLFDTASATWQKTSYSK
WD-40 repeat domains are KYGVDLICFTNHQTSVLYSSKNGWDESLRHLSLMDNKYL
underlined. RYFKGHHDRVVSLCMSPKGECFMSGSLDRTVLLWDLRID
KCQGLIRVRGRPAVAYDEQGLVFAISNEGGLIKMFDARL
YDKGPFDTFVVEGDKSEASGIKFSNDGKLILLSTMDSNI
HVLDAYQGTTVHSFSVEAVPNGGEAVPNGGTLEASFSPD
GKFVISGSGNGNIHAWSVNSGKEVACWTTEGVIPAVVKW
APRRLMFASGSSVLSLWVPDLSKLASLTGSNSNSAY
207 The amino acid sequence of SEQ ID MHRVGSTGNTSNSSRPRREKRLTYVLNDANDSRHCSGIN
476. The conserved G-protein beta CLVISKLSLLGGNDYLFSGSRDGTLKRWELADDSAVCSA
WD-40 repeat domains are TFESHVDWVNDAVLTGETLVSCSSDTTLKTWRPFSDGVC
underlined. TRTLRQHSDYVTCLAAASKNSNIVASGGLGREVFIWDIE
AAMAPVSRTSEAMDDDTSNGVLSSGNSVLSTTVRSTNAT
NSASLHTSQLQGYTPIAAKGHKESVYALAMNDVGTLLVS
GGTEKVVRVWDPRSGAKQMKLRGHTDNVRALILDSTGRF
CLSGSSDSIIRLWDLGQQRCVHSYAVHTDSVWALASTPN
FSHVYSGGRDLSLYLTDLTTRESLLLCMEKHPLLRLTLQ
DDSIWVATTDSSLHRWPAEGQNPPKMFQRGGSFLAGNLS
FTRARACLEGSAPVPVNTQPSFVIPGSPGIVQHEILNNR
RHVLTKDAEGTVKLWEITRGAVLDDYGKVSFEEKKEELF
EMVSIPAWFTMDTRLGSMSVHLDTPQCFTAEMYAVDLNV
PDAPEEQKINLAQETLRGLLAHWLSRRRQRLATQASANG
DFPAGQENALRNHISSRIDVHDDAETHIAGILPAFDFST
TSPPSIITEGSQGGPWRKKITDLDGTEDEKDFPWWCLEC
VLHGRLSPRESLKCSFYLHPYEGTTVQVLTQGKLSAPRI
LRIQKVINYVLEKMVLDRPLDSSNSETTFTPGLSGNQSH
AAVVGDGSLRSGARVWQQKAKPLVEILCNNQVLSPDMSL
ATVRTYIWKKPDDLYLYYRLVQNR
208 The amino acid sequence of SEQ ID MMKGKTIQMQAAHQNHDGETSVACVLWDWHAKHLITAGA
477. The conserved G-protein beta DNTILIHSYPSSSSSKPITLRHHKNAVTALAINSNVRSL
WD-40 repeat domains are ASGSVDHSVKLYSYPGGEFQSNVTRFTLPIRSLAFNKSG
underlined. ELLAAAGDDEGIKLISTIDNSIARVLKGHNGPVTSISFD
PKNEFLASSDSDGTVIYWELSTGKPVHTLKKIAPNTTSN
PTSLNQISWRPDGEMLAVPGRKSEVSMYDRDTAEKLFSL
KGGHSDTICSLAWSPNGKYIATAGTDRQVMVWDADRRQD
IDKQRFDNPICSVAWKPSDNALAVIDVLGRFGVWESPIA
SHMKSPADGAERYDNMEDEEPLMARYEEELEDSVSGSLN
EIINDDDDDDEMGKIPRKILQKKPSVKVEKGKEESNAKA
FKSGQDSFKLKSAMQEAFQPGATQRQSGKRNFLAYNMLG
SVITFDNDGFSHIEVDFHDIGKGCRVPSMTDYFGFTMAS
LSESGSVFGSPQKGEKNPSTLMYRPFSSWANNSEWSMRF
PMGEEVKAVALGSGWVAAVTSLNFLRVFSEGGLQKFVLS
MDGPVVTAAGYENLLVVVSHASNPLLSGDQVLSFTVYDI
SQKTCPLSGRLPLSPGSHLTWLGFSEEGLLSSYDSEGNL
RVFTNDYNGCWVPIFSAARERKSETESIWMVGLNSTQVF
CVVCKLPDTYPQVAPKPVLSVLNLSLPLACSDLGADDLE
NEYLRGSLLLSQMQKKAEDAVACGRESNMEEDSIFKMEA
ALDRCLLRLIANCCKGDKLVRATELARLLSLEKSLQGAI
KLVSAMKLPMLAERFNTILEEKILQENMETISCRRLTSE
AQDMDTPISISVKQVSYGANLGDSPFLPNRQVEPKHSTP
VFSKPDTKIEVDTSEAIAKGCDAQNGNIKSGDAEVQPAS
HNDSIQKPSNPFAKASNTSANQAVQRNASLLSSIKQMKT
ATENEGKRKERARSGSLPQKPAKQSKIS
209 The amino acid sequence of SEQ ID MKQKRKGHQVDDPKYSVQTPQEDDTPNESGPASEEVESS
478. The conserved G-protein beta DEEGGNSSNIEDDIIYSSSEEDPVVSSDYEEDEDAESDA
WD-40 repeat domains are EGVTAEQELEGDIDNALQNYMGTLTVLSNFHGENLKNAE
underlined. GEDTSGDDDDEEEMPKRAEESDSPEDENDERPKRAEESD
FSEDEDEERPKRAEESDSSEDEVPSRNTVGDVPLRWYKD
EQHIGYDIKGKKIKKQPKKDQLDSFLASTDDSSDWRKVY
DEYNDEEVELTKDEIKFISRLRKGTIPHADVNPYEPYVD
WFDWKDKGHPLSNAPEPKRRFIPSKWEAKKVVKLVRAIR
KGWITFQKAEEKPRFYLMWGDDLKPSEKMANGLSYIPAP
KPKLPGHEESYNPPPEYIPTQEEINSYQLMYEEDRPKFI
PKRFDSLRNVPAYDRFLSEIFERCLDLYLCPRTRKKRIN
IDPESLIPKLPKPKDLQPFPSICFLEYKGHTGAVSCISP
ESSGQWLASGSKDGTVRIWEVETARCLKVWDIGRPIQHI
AWNPVSQLSILAVAVDEEVLVLNTGLGSEDSQEKVAELL
HVKSKPVSADDLGDNTSLTKWIKHEKFDGIKLTHLKPVH
LISWHHKGDYFATVAPDGNTRAVLVHQLSKQQTQNPFKK
MQGRVVHVLFHPSRAIFFVATKTHVRVYDLVKQQLVKRL
VTGLHEVSSMAVHHKGDNLLVGSKEGKVCWFDMDLSTQP
YKTLKNHSKDIHSVAFHDSYPLFASCSDDCKAYVFYGLV
YSDLLQNPLIVPLKVLQGHQSVNGMGVLDCQFHPKQPWL
FTAGADSVVKLYCN
210 The amino acid sequence of SEQ ID MMSLKRGFEESLVPAKRQKTELSTVTYGDGPRRTSSLES
479. The conserved G-protein beta PIMLLTGHHAAIYTMKFNPTGTVIASGSHEREIFLWNVH
WD-40 repeat domains are GDCKNFMVLKGHKNAVLDLHWTTDGCQIISASPDKTLRA
underlined. WDVETGKQIKKMAEHSSFVNSCCPSRRGPPLVVSGSDDG
TAKLWDLRHRGAIQTFPDKYQITAVGFSDAADKIYSGGI
DNEIKVWDLRRGEVTMRLQGHTDTITGMQLSSDGSYLLT
NSMDCSLRIWDMRPYAPQNRCVKILTGHQHNFEKNLLKC
SWSSDGSKVTAGSADRMVYIWDTTTRRILYKLPGHTGSV
NETGFHPTQPIIGSCSSDKQIYLGEIEPNVGYQAVI
211 The amino acid sequence of SEQ ID MEFSDTYKHTGPCCFSPDARYLAIAVDYRLVIRDVVTLK
480. The conserved G-protein beta VVQLYSCMDKISNIEWALDSEYILCGLYKRAMVQAWSLS
WD-40 repeat domains are QPEWTCKIDEGPAGIAHARWSPDSRHIITTSDFQLRLTV
underlined. WSLVNTACIHIQWPKHASKGVSETQDSKFAAIATRRDCK
DYVNLLSCHTWEVMGTFTVDTIDLADLEWSPNDSAIVVW
DSPLEYKVLIYSPDGRCLFKYQAYDSWLGVKTVAWSPCS
QFLAVGSYDQTLRTLNHLTWKPFAEFVHVSTVRGPASAV
VFKEVEEPWNLDVSGLHLNDDNAHDIQDGKPAEGHSRVR
YKVVEFPVNVSSQKHPVDKPNPKQGIGLLAWSRDSQYLF
TRNDNMPTALWIWDICRLELAALLIQKEPIRAAAWDPVY
PRVALCTGSSHLYMWTPSGACCVNIPLPQFVVSDLKWNP
DGTSMLLKDRESFCCTFVPMLPEFNDDETNEE
212 The amino acid sequence of SEQ ID MAKLIETHSCVPSTERGRGILIAGDAKTNSIIYCNGRSV
481. The conserved G-protein beta IMRNLDNPLEASVYGEHSYPATVARFSPNGEWVASGDTS
WD-40 repeat domains are GTVRIWGRGSDHTLKYEYKALAGRIDDLEWSADGQRIVV
underlined. CGDSKGKSMVRAFMWDSGTNVGEFDGHSRRVLSCSFKPT
RPFRVATCGEDFLVNFYEGPPFRFKTSHRDHSNYVNCVR
FAPDGSKFITVGSDRKGVIFDGKMGEKIGELSKEGGHTG
SIYAASWSPDSKQVLTVSADKSAKIWEISETGNGTVKKT
LTFGSQGGADDMLVGCLWLNDYLITVSLGGIVSLLSAVD
PDKPPKTISGHMKSINAIALSLQSGQSEVCSSSYDGVIV
RWILGVGYAGRVERKDSTQIKCLATIEGELVTCGFDNKV
RRVPLLSEQHKESEPIDIGAQPKDLDVAVGCPELTFVST
DAGIIIIRASKIVSTTNVGYAVTAAAISPDGTEAVVGGQ
DGKLRVYSIKGDTLLEESVLERHRGPINAIRFSPDGSMF
ASGDLNREAVVWDRITREVKLKNMVYHTARINCIAWSPD
SSKVATGSLDTCILIYEVGKPASSRITIKGAHLGGVYGL
AFSDQSTVISAGEDACVRVWSLP
213 The amino acid sequence of SEQ ID MPQPSVILATAGYDHTVRFWEATSGRCYRTLQYPDSQVN
482. The conserved G-protein beta HLEITPDKQYLAAAGNPHIRLFEVNSNNPQPVISYDSHT
WD-40 repeat domains are NNVTAVGFQCDGKWMYSGSEDGTVKIWDLRAPGFQREYE
underlined and the Trp-Asp (WD) SRAAVNTVVLHPNQTELISGDQNGNIRVWDLNANSCSCE
repeats signature is in bold. LVPEDTAVRSLTVMWDGSLVVAANNHGTCYVWRLMRGTQ
TMTNFEPLHKLQAHNSYILKCLLSPEFCEHHRYLATTSS
DQTVKIWNVDGFTLERTLTGHQRWVWDCVFSVDGAFLVT
ASSDSTARLWDLSTGEAIRTYQGHHKATVCCALHDGTDG
ASC
214 The amino acid sequence of SEQ ID MLTKFETKSNRVKGLSFHPKRPWILASLHSGVIQLWDYR
483. The conserved G-protein beta MGTLIDKFDEHDGPVRGVHFHKTQPLFVSGGDDYKIKVW
WD-40 repeat domains are NYKMRQCLFTFVGHLDYIRTVHFHNEYPWIVSASDDQTI
underlined and the Trp-Asp (WD) RLWNWQSRVCISVLTGHNHYVMSASFHPKEDLVVSASLD
repeats signature is in bold. The QTVRVWDISGLRKKTVSPADDLSRLAQMNTDLFGGGDVV
coatomer WD associated region is VKYVLEGHDRGVNWAAFHTSLPLIVSGADDRQVKLWRMN
in bold/italics. DTKAWEVDTLRGHTNNVSCVIFHARQDIIVSNSEDKSIR
VWDMSKRTSVQTFRREHDRFWILAAHPEMNLLAAGHDSG
MIVFKLERERPAYVVYGGSLLYVK
IPVLPPGKKSSLLMPPAPILHGGDWPLLRVTKGIFE
GGLENSTSAAYEEEDEEAAADWGEDIDIENIEGENGEAT
VLDDQEVKGGEDDEGGWDMEDLELPPDVAAANVGTNQKT
LFVAPTLGMPVSQIWMQKSSLAGEHAAAGNFETALRLLT
RQLGIKNFSPLKPLFLELYMGSHTFLPSFASVPAFSLAL
QRGWSESASPNIRGPPALVYRLSVLEEKLTVAYRATTEG
RFSEALRLFL
215 The amino acid sequence of SEQ ID MDLLQNYQDDSEDSNPELRNHPPLEDATATSAPAGVENE
484. The conserved G-protein beta TSSSPDSSPLRLALPAKSCAPDVDETLMALGVPGSEKKN
WD-40 repeat domains are NHNKPIDPTQHSVTFNPSYDQLWAPLYGPAHPYAKDGIA
underlined. QGMRNHKLGFVEDSAIEPFMFDEQYNTFHRYGYAADPSA
SLGSHIVGDLESLKKNDGASVYNLPKREHKRQKLEKKMI
QKDENEEEEKEVGEEVDNPSTEEWLKKNRKSPWAGKKEG
LQTELTEEQKKYAQEHAEKKGDREKGEKVEIVDKTTFHG
KEERDYQGRSWIDPPKDAKATNDHCYIPKRWVHTWSGHT
KGVSAIRFFPKYGHLLLSAGMDTKVKIWDVFNSGKCMRT
YMGHSKAVRDISFSNDGSRFLSAGYDRNIKLWDTETGKV
ISTFSTGKIPYVVKLHPDEDKQNVLLAGMSDKKIVQWDM
NSGEITQEYDQHLGAVNTITFVDNNRRFVTSSDDKSLRV
WEFGIPVVIKYISEPHMHSMPSISLHPNTNWLAAQSLDN
QILIYSTRERFQLNKKKRFAGHIAAGYACQVNFSPDGRF
VMSGDGEGRCWFWDWKTCKVFRTLKCHDNVCIGCEWHPL
EQSKVATCGWDGMIKYWD
216 The amino acid sequence of SEQ ID MARKGLGTDPAIGSLMSSKKRKEYKVTNRFQEGKRPLYA
485. The conserved G-protein beta IAFNFIDARYHNIFATAGGTRVTIYQCLEGGAISVLQAY
WD-40 repeat domains are VDDDKDESFYTLSWACDVNGSPLLVAGGHNGIIRVLDVA
underlined and the Trp-Asp (WD) NEKVHKSFVGHGDSVNEIRTQALKPSLILSASKDESVRL
repeats signature is in bold. WNVQTGICILIFAGAGGHRNEVLSVDFHPSDVYRIASCG
MDNTVKIWSMKEFWTYVEKSFTWTDLPSKFPTKYVQFPV
FIAAVHSNYVDCTRWLGNFILSKSVDNEVVLWEPYSKEQ
STSDGVVDILQKYPVPECDIWFIKFSCDFHYNSMAVGNR
EGKVYVWELQSSPPNLIARLSHAHCKNPIRQTAISHDGS
TILCCCDDGSMWRWDVVQ
217 The amino acid sequence of SEQ ID MESGAGGSVGARVPSAKPEMLQQPPYSNGDDDNDMERGT
486. The conserved G-protein beta APVPSSNPNTVSKWELDKDFLCPICMQTMKDAFLTACGH
WD-40 repeat domains are SFCYMCIMTHLNNKSNCPCCSLYLTNNQLFPNFLLNKLL
underlined and the Trp-Asp (WD) KKTSACQMASTASPVENLCLSLQQGAEVSVKELDFLLTL
repeats signature is in bold. LAEKKRKMEQEEAETNMEILLDFLQRLRQQKQAELNEVQ
ADLHYIKDDILALEKRRLELSRARERYSRKLHMLLDDPM
DTTLGHAAIDDGNNVRTAFVRGGQGDAISGKFQQKKAEI
KAQASSQGMQKRANFCHSDSQVLPTLSGLTIARKRRVLA
QFDDLQECYLQKRRRWATQLRKQCDGGLRKERDGNSISR
EGYHAGLEEFQSILTTFTRYSRLRVISELRHGDLFHSAN
IVSSIEFDRDDELFATAGVSRRIKVFDFATVVNEPADVH
CPVVEMSTRSKLSCLSWNKCIKSQIASSDYEGIVTVWDV
NTRQSVMMYEEHEKRAWSVDFSRTEPTRLISGSDDGKVK
VWCTRQETSVLNIDMKANICCVKYNPGSSYYVAVGSADH
HIHYYDLRNPSVPLYEFNGHRKTVSYVKFISTNELASAS
TDSTLRLWDVRDNCLVRTFKGHTNEKNFVGLTVNSEYIA
CGSETNGVFVYHKAISKPAAWHQFGSPDLDDSDDDTSHF
ISAVCWKSESPTMLAANSQGTIKVLVLAP
218 The amino acid sequence of SEQ ID MANYVDSKKNFKCVPALQQFYTGGPFRLSSDGSFLVCAC
487. The conserved G-protein beta NDEVKVVDLATGSVKNTLEGDSELIVALALTPDNKYLFS
WD-40 repeat domains are ASRSTQIKFWDLSSATCKRTWKAHNGPVADMACDASGGL
underlined. LATAGADRSILVWDVDGGYCTHSFRGHQGVVTTVIFHPD
PHCLLLFSGSDDATVRIWDLVAKKCISVLEKHFSTVTSL
AISENGWNLLSAGRDKVVNIWDLRDYHCRATIPTYEPLE
AVCVLPTGSRLVSVMNQSRALPENRKKSGAAPVYFLTVG
ERGIVRIWYSEGALCLYEQKSSDAIISSDKDELKGGFVS
AVLLPLTQGVMCVTADQRFLFYNLDESDEGKCDLKVSKR
LIGYNEEIVDLKFLGDEEKFLAVATNLEQVRMYDLSSMT
CVYELSGHTDIVLCLDTVVFSGHSLLASGSKDHTVRIWD
TESKSCICVAAGHMGAVGAVAFSKKAKNFFVSGSSDRTI
KVWSFASVLDFGGISKSIKLSSQAAVAAHDKDINSVAVA
PNDSLICTGSQDRTARIWRLPDLVPVLVLRGHKRGVWCV
EFSPVDQCVMTASGDKTIKIWALSDGSCLKTFEGHTASV
LRASFLTRGTQFVSSGADGLLKLWTIKSNECIATFDQHE
DKIWAMAVGKKTEMLATGGSDSLVNLWHDCTTTDEEEAL
LKEEEAALKDQELLNALADTDYVKAIQLAFELRRPYKLL
NVFTELYSKGHAQDQIQKVIRELGNEELRLLLEYVREWN
TKPKFAHVAQFVLFQLFNVLPPKEIIEVQGISELLEGLI
PYAQRHYSRIDRLMRSTFLLDYTLSSMSVLSPTETDLSS
SNLLARTADPLHAQIDQFHPTHFPEPNLTPIQSLLDSGN
TDSVEVTARRAKKKRVSGNDSEKTTVAEVKIGDMENAFD
EPDVADQGSSRKHKPASSKKRKSIAVGNASIKRIASGNA
VTIALQV
219 The amino acid sequence of SEQ ID MESSCSSMNSNRHSTEKRCLRPLQKQGASMNKHSSDRFI
488. The conserved G-protein beta PARGSIDLDVARFMVTQKQKDNNDIHALSPSPSPSKKAY
WD-40 repeat domains are QKEMADTLLKNAGAADNNCRILSFNGKSSTVSQGSQENV
underlined. LANLSISRRARRYIPQSADRTLDAPDLLDDYYLNLLDWS
STNVLSTALGNTVYLWDASNSSISELLIADEEEGPVTSV
SWAPDGSQIAVGLNNSVVQLWDSQSNKKLRALKGHHDRV
GALSWNGPILTTGGLDGIIINHDVRTRDHIVQTYKGHTQ
EVCGLKWSPSGQQLASGGNDNLLYIWDKSMASHNPSSQY
FHQLDEHCAAVKALAWCPFQTNLLASGGGTSDGSIKFWN
TQTGACLNTVDTHSQVCSLLWNRHERELLSSHGLNQNQL
TLWKYPSMVKITELTGHTARVLHMAQSPDGYTVASAAAD
ETLKFWQVFGAPDASKKTKDTKGAFNMFHMHIR
220 The amino acid sequence of SEQ ID MLDEIVADEEEEFNIWKKNTPLLYDVVITHALEWPSLTV
489. The conserved G-protein beta QWLPDRHQSPTKDYSLQKMIVGTHTSGDEPNYLMIAEVQ
WD-40 repeat domains are MPLQYSEDGNVGGFESTEAKVHIIQQINHEGEVNRAQYM
underlined. PQNSFIIATKTVSSDVYVFDYTKHSSNAPQERVCNPELI
LKGHTNEGYSLSWSPLKEGQLLSGSNDAQICFWDINAAS
GRKVVEAKQIFKVHEGAVEDVSWHLKHEYLFGSVGDDCH
LLIWDTRTAAPNKPQHSVVAHESEVNSLAFNPFNEWLLA
TGSADKTVKLFDLRKLSCSLHTFSNHTEEVFQIEWSPMN
ETILASSGGDRRLMVWDLRRIGDEQTSEDAEDGPPRLIF
IHGGHTSKISDFSWNLHDDWLIASVAEDNILQIWQMAEN
IYHDDADIL
221 The amino acid sequence of SEQ ID MTKEDHGESRDEMGERMVNEEYKLWKKNTPFLYDLVITH
490. The conserved G-protein beta ALEWPSLTVQWLPPSCKQQQDIIKDDDIDHPNTQMVILG
WD-40 repeat domains are THTSDNEPNYLILAEVQLHDGTEDEDGDGDVKRPQDKMK
underlined. PGTSGGAMGKVRILQQINHQKEVNRARYMPQKPTIIATK
TVNADVYVFDYSKHPSKPPQEGRCNPELRLQGHESEGYG
LSWSPLKEGHLLSASDDAQICLWDITAATKAPKVVEANQ
IFRYHDGPVEDVAWHAIHDHLFGSVGDDHHLLLWDIRND
SEKPLHIVEAHQAEVNCLAFNPFNEWIVATGSADRTVAL
HDIRKLDKVLHTCAHHMEEVFQIGWSPQNGAILASCGSD
RRLMVWDLSRIGDEQNPEDAEEAPPELLFIHGGHTSKIS
DFSWNPAEEWVIASVAEDNILQVWQMSEHIYNDDNDSPTA
222 The amino acid sequence of SEQ ID MAMAMGDENAADPVEEFNIWKKNTPFLYDLVITHALEWP
491. The conserved G-protein beta SLTVQWLPDRHQSSTADYSLQKMIVGTHTSEDEPNYLMI
WD-40 repeat domains are AEVQIPLQNSEDNIIGGFESTEAKVQIIQKINHEGEVNK
underlined. ARYMPQNSFVIATKTVSSDVYVFDYSKHPSKAPQERVCN
PELILKGHSNEGYGLSWSPLKEGYLLSGSNDAQICLWDI
NAAFGKKVLEANQIFKVHEGAVGDVSWHLKHEYLFGSVG
DDCHLLIWDMRTAAPNKPQQSVIAHQSEVNSLAFNPFNE
WLLATGSMDKTVKLFDLRKLSCSLHTFSNHTDQVFQIEW
SPMNETILASSGADRRLMVWDLARIGETPEDEEDGPPEL
LFVHGGHTSKISDFSWNLNDDRVIASVAEDNILQIWQMA
ENIYHDDEDML
223 The amino acid sequence of SEQ ID MGLFEPFRALGYITDGVPFAVQRRGIETFVTLSVGKAWQ
492. The conserved G-protein beta IYNCAKLIPVLVGPQMDKKIRALACWRDFTFAATGHDIA
WD-40 repeat domains are VFRRAHQVATWSGHKAKVTLLLSFGQHVLSVDLEGCLFI
underlined and the Trp-Asp (WD) WAVAEVNQNKPPIGQIQLGEKFSPSCIMHPDTYLNKVLI
repeats signature is in bold. GSEEGTLQLWNVNTRKKLYEFKGWGSSIRCCVSSPALDV
VGIGCSDGKIHVHNLRYDEEIVTFMHSTRGAVTALSFRT
DGQPLLAAGGSSGVISIWNLEKKKLQSVIKDAHDSSVCS
LHFFANEPVLMSSATDNSIKMWIFDTTDGEARLLKYRSG
HSAPPMCIRYYGKGRHILSAGQDRAFRIFSVIQDQQSRE
LSQGHVGKRAKKLKVKDEEIKLPPVIAFDAAEIRERDWC
NVVTCHLDDPCAYTWRLQNFVIGEHILKPCLEDPTPVKS
CSISACGNFAVLGTEGGWLERFNLQSGISRGTYIDIGEK
RQCAHNGAVVGLACDATNTLLISGGYNGDIKVWDFKGRE
LKFRWEIEVPLIKIVYHPGNGILATAADDMILRLFDVTA
MRLVRIFVGHMDRVTDLCFSGDGKWLLSSSMDGTIRVWD
IISSRQLNAMHMDSAVTALSLSPGMDMLATTHVGHNGIY
LWANRMIYSKATDIEPFISGKQVVKVSMPTVSSKRESEE
GDEKRTIVAESNVNKSDVSGSLIGDSYSAQLTPELVTLA
LLPKAQWQSLVNLDIIKMRNKPIEPPKKPEKAPFFLPSL
PTLSGERIFIPSSMNGDGDQDETRNDKTVFEARGKKLGG
ESLSFMQLLQSCAKIKDFTTFTNYLKGLSPSAVDMELRL
LQIVDNENISETEHSVELQGIGMLLDYFVNEVSCNNNFE
FVQALIRLFLKIHGETIRCQVSLQEKARKLLEIQSSTWE
RLDTSFQNARCMITFLSSSQF
224 The amino acid sequence of SEQ ID MIAAVCWVPKGVAKVLPDSAEPPTQEEIQELLKCNVVAE
493. The conserved G-protein beta SDDNEDSDEESEEMDTETDKNTDAVAKALAAANALGSQS
WD-40 repeat domains are SDFQRQHKVDDIANGLKELDMDHYDDEDEGIDIFGSGSL
underlined and the Trp-Asp (WD) GNCYYPANDMDPYLVEQDDDDEDEIEDMTIKPSDLIILS
repeats signature is in bold. ARNEDDVSHLEVWIYEEETEEGGSNMYVHHDIILPAFPL
SLAWLDCNLKGGEKGNFVAVGTMQPEIELWDLDVLDEVE
PAVVLGGAVKDEASGKTTKLKKKKKNKQAVNFKEGSHTD
AVLGLAWNMEYRNVLASASADKSVKIWDIVAEKCEHTMQ
PHTDKVQAVAWNPNQATVLLSGSFDRSVIMMDMRAPTHS
GIRWPVPADVESLAWDPHTDHSFMVSAEDGTVRGFDIRA
AASTADFDGKPMFILHAHDKAVCAISYNPAAPSLLTTGS
TDKMVKLWDITNNQPSCIASTNPNVGAVFSAAFSKNSPF
LLATGGSKGILHVWDTLDNSEVARRFGKFRPQN
225 The amino acid sequence of SEQ ID MIMDENEFCDIFSLRKRLCLLSSQEGEEEEELEAMSQLD
494. The conserved eukaryotic AGEFTVTGNEEVVAIAEDDVNTGILSQDLFSSQDYCTPS
protein kinase domain is QPQDSTDLDSKDKAPCPLSPVKSTIQRKRCRPELLSNPP
underlined. DSIQFSFQRLERVRSEESIQSSSQQLARVRSEVSSSDDF
KTPKITASGQKNYVSQSALALRARVMSPPCIKNPYLDEN
EELNEKIQRSTRRSPACVTPIQSGACLSRYRADFHELEE
IGRGNFSRVYKALNRLDGCCYAVKCSQSELRLDTERKVA
LMEVQSLAALGPHKNIVGYHTAWFENDHLYIQMELCDHN
LTTANDRGILRTDTDFLEAVYQIAQALEFIHGRGVAHLD
VKPENIYVRDGTYKLGDFGRATLINGTLHVEEGDARYMS
REILNDNYEHLDKVDMFSLGATFFELLMRKQYPGSGKRI
DRDTEIKIPILPGFSIYFQKLLQDLVSNDPGKRPSAKDV
LKNPIFNKVRGAKEV
226 The amino acid sequence of SEQ ID MLAPALEMEPVEPQSLKKLSFKSLKRALDLFSPVHGQIA
495. The conserved G-protein beta PPDPESKKMRISYKLNFEYGGGSGSEDQVPKRKESGAAQ
WD-40 repeat domains are NQGQQAAGASNALALPGPEGSKIPPMEKSQNALTVGPSL
underlined and the Trp-Asp (WD) RPQGLNDVGLHGKGTAIISASGSSDRNLSTSAIMERLPS
repeats signature is in bold. RWPRPVWHPPWKNYRVISGHLGWVRSIAFDPSNQWFCTG
SADRTIKIWDLASGRLKLTLTGHIEQIRGLAVSSKHTYM
FSAGDDKQVKCWDLEQNKVIRSYHGHLSGVYCLALHPTI
DILLTGGRDSVCRVWDIRSKMQIFALSGHDNTVCSVFAR
PTDPQVVTGSHDTTIKFWDLRHGKTMTTLTNHKKSVRAM
AQHPKENCFASASADNIKKFQLPRGEFLHNMLSQQKTII
NTMAVNEEGVMATGGDNGSLWFWDWKSGHNFQQAHTIVQ
PGSLESEAGIYALSYDLTGSRLVSCEADKTIKMWKEDEL
ATPETHPLNFKPPKDIRRF
227 The amino acid sequence of SEQ ID MEEAAKEQSAGSGKPKLLRYGLRSAAKPKEDKKEEQLHQ
496. The conserved G-protein beta PPPPPPPQQQAAPAPAPAATRSSTSGSAGGRDRRPQQQH
WD-40 repeat domains are AVDEKYARWKSLVPVLYDWLANHNLLWPSLSCRWGPQLE
underlined. QATYKNRQRLYISEQTDGSVPNTLVIANCEVVKPRVAAA
EHVSQFNEEARSPFIRKYKTIIHPGEVNRIRELPQNPNI
VATHTDSPDVLIWDVESQPNRHAVYGATASRPNLILTGH
QENAEFALAMCPAEPFVLSGGKDKTVVLWSIQDHITASA
TDQTTNKSPGSGGSIIKKTGEGNEETGNGPSVGPRGIYC
GHEDTVEDVAFCPSTAQEFCSVGDDSCLILWDARIGTNP
VAKVEKAHNGDLHCVDWNPHDNNLILTGSADNSVNMFDR
RNLTSNGVGSPVYKFEGHKAAVLCVQWSPDKPSVFGSSA
EDGLLNIWDYERVDKKVDRAPNAPAGLFFQHAGHRDKIV
DFHWNTADPWTMVSVSDDCDTAGGGGTLQIWRMSDLIYR
PEEEVLAELENFKAHVLECSKA
228 The amino acid sequence of SEQ ID MAKDEEEFRGEMEERLVNEEYKIWKKNTPFLYDLVITHA
497. The conserved G-protein beta LEWPSLTVQWLPDREEPPGKDYSVQKMILGTHTSDNEPN
WD-40 repeat domains are YLMLAQVQLPLEDAENDARQYDDERGEIGGFGCANGKVQ
underlined. VIQQINHDGEVNRARYMPQNPFIIATKTVSAEVYVFDYS
KHPSKPPQDGGCHPDLRLRGHNTEGYGLSWSPFKHGHLL
SGSDDAQICLWDINVPAKNKVLEAQQIFKVHEGVVEDVA
WHLRHEYLFGSVGDDRHLLIWDLRTSATNKPLHSVVAHQ
GEVNCLAFNPFNEWVLATGSADRTVKLFDLRKISSALHT
FSCHKEEVFQIGWSPKNETILASCSADRRLMVWDLSRID
EFQTPEDALDGPPELLFIHGGHTSKISDFSWNPCEDWVI
ASVAEDNILQIWQMAENIYHDEEDDMPPEEVV
229 The amino acid sequence of SEQ ID MGKYMRKGKGVGEVAVMEVSQGSLGVRTRARTLAAASSQ
498. The conserved cyclin- KDHRRLGASKSVTTKHQSSAPPASPCVESSMHTCYLELR
dependent kinase inhibitor domain SRKLEKFSRCYHSAHGATSHGESKRSLSLSEPSRLAVSE
is underlined. EARVASDKSSHRVLQQQSSVAHSRNNSATFSHNAKPAKA
AQRKERRDDDHTSARPSEAPHEDEDGMEVEASFGENVMD
LDSRERRTRETTPSSYTRDVETMETPGSTTRPPSNAGRR
RFQTEGGHGTRNQFHVPTTNEIEEFFAGAEQQEQRRFTD
RYNYDPVSDSPLPGRFEWVRLRP
230 The amino acid sequence of SEQ ID MQNMEENVQSSWSLHGNKEICARYEILKRVSSGTYLDVY
499. The conserved RGRRKEDGLIVALKEVHDYQSSWREIEALQRLCGCPNVV
serine/threonine protein kinase RLYEVILEFLTSDLYSVIKSAKNKGENGIPEAEVKAWMI
domain is underlined, and the QILQGLANCHANWVIHRDLKPSNMLISAYGILKLADFGS
serine/threonine protein kinase MSFLKRAIYEVEYELPQEDILADAPGERLMDEDDSVKGV
active-site signature is in bold. WNEGEEDSSTAVETNFDDMAETANLDLSWKNEGDMVMQG
FTSGVGTRWYRAPDFLYGATIYGKEIDLWSLGCILGELL
ILEPLFSGTSNIDQLSRLVKVLGLQQKKNWPGCSNLPDY
RKLCFPGDGSPVGLKNHVPNCSDNMFSILERLVCYDPAA
RLNAKEIVENKYFVEDPYPVLTHELRVPSPLREENNFSE
DWAKWKDMEVDSDLENIDEFNVVHSSDGFCIKFS
231 The amino acid sequence of SEQ ID MADVPESLQQEKDEQGTDKNCCDGKFQKEIDIDDMEEEY
502. The conserved histone NESSIDDEEENLSDNVATNNMGTTPQGQACMAVTVEGIE
deacetylase family domain is HANSVGCGRNGREGSEEVTAAEDMGHVSIENIREQGRNR
underlined. KSSEQLLALYEQEGLLEDDEDDDDVDWEPFEGVTVQMKW
YCTNCTMANSDDSVHCDSCGEHRNSDILRQGFLASPYLP
AESPSSSDVPDERLEESKCVMTTLTPSISPMIGVCCSSL
QSERRTVVGFDERMLLHSEIQMETYPHPERPDRLRAIAA
SLRAAGLFPGKCFSIPAREATCEELQTIHSLEHVNAVES
TSCGMLSHLSPDTYANEHSSLAARLAAGLCADLAKAIMT
GQAQNGFALVRPPGHHAGVKDSMGFCLHNNAAIAVSASR
VVGAKKVLIVDWDVHHGNGTQEIFEADQSVLYISLHRHG
EGFYPGSGAVTEVGSSKGEGYSVNIPWKCGGVGDNDYIF
AFQHAVLPIAEQFEPDLTIISAGFDAAKGDPLGRCEVTP
DGFAHMAQMLSCLSKGKMLVILEGGYNLRSISASATAVI
KVLLGDNPKALPIDIQPSKGGLQTLLEVFEIQSKYWSSL
KGHDQKLRSQWEAQYGSKKRKVIRKRHMHIVGGPVWWKW
GRKRVVYYHWFARVSSRKHL
232 The amino acid sequence of SEQ ID MASGAGAAGVVEWHQKPPNPKNPVVFFDVTIGTIPAGRI
503. The conserved cyclophilin- KMELFADIVPRTAENFRQFCTGEYRKAGIPIGYKGCHFH
type peptidyl-prolyl cis-trans RVIKDFMIQAGDFVKGDGSGCISIYGSKFEDENFIAKHT
isomerase family domain is GPGLLSMANSGPNTNGCQFFLTCAKCDWLDNKHVVFGRV
underlined and the cyclophilin- LGEGLLVLRKIENVQTGQHNRPKLPCVIAECGEM
type peptidyl-prolyl cis-trans
isomerase signature is in bold.
233 The amino acid sequence of SEQ ID MDHYYQDDFDYLVDDEMVDFADDVEDDVRTRRRSDIDSD
505. The conserved G-protein beta SENDFDSNNKSPDTTALQAKRGKDIQGIPWNRLNFTREK
WD-40 repeat domain is underlined. YRETRLQQYKNYENLPRPRRSRNLDKECTNFERGSSFYD
FRHNTRSVKATIVHFQLRNLVWATSKHNVYLMQNYSIMH
WSSLKQKGEEVLNVAGPIIPSVKHPGSSPQGLTRVQVSA
MSVKDNLVVAGGFQGELICKYLDKPGVSFCTKISHDENG
ITNAVEIYNDASGATRLMTANNDLAVRVFDTEKFTVLER
FSFPWSVNHTSVSPDGKLVAVLGDNADCLLADCKTGKTV
GTLRGHLDYSFAAAWHPDGYILATGNQDTTCRLWDVRKL
SSSLAVLKGRMGAIRSIRFSSDGRFMAMAEPADFVHLYD
TRQNYTKSQEIDLFGEIAGISFSPDTEAFFVGVADRTYG
SLLEFNRRRMNYYLDSIL
234 The amino acid sequence of SEQ ID MDCSGDEEEEQFFESLEEMLSPSDSGSEAADNETGCRNA
506. The conserved G-protein beta DARSKYEIWKRAPSSIQERRQRFLVRMGLANPSELGNQV
WD-40 repeat domains are NSTSAESTCSTETANIPNGIERLRENSGAVLRTAGSSGR
underlined. KTHCKNVINIGLREGSVRSSSSSNGTPDVGEDNGEFGGT
IFSRSGGTWECMCKIKNLDSGKEFVVDELGQDGLWNKLR
EVGTDRQLTMDEFERSLGLSPLVQELMRRESGVAQADCN
GVHHHDAEISSSKRRSWLKALKSAAYSMRRPKEDQSNYD
SERSGRRSGSFDVPWGKPQWTKVRHYRKRYKEFTALYMG
QEIEAHEGSIWTMKFSLDGRYLASAGQDCVIHVREVIES
MRTFGADTPDLYASSAYFSMNGLQELVPLSIEDHANKMK
RGKIIGSKKSSNSDCIVLPNKVFQLSEEPVCSFHGHLLD
VFDLSWSPSQYLLSSSMDKTVRLWKLGHESCLKVFSHND
IVTCIQFNPVDERYFISGSLDGKARIWSIPDRQVVDWSD
LREMVTAVCYTPDGQGGLVGSIKGSCRFYNTSGNKLQLE
NQLNVRSKKKKSSGKKITGFQFAPGGDSQKVLITSADSR
VRVYNGSELVCKYKGFRNTCSQISASFAPNGQHFVCASE
DSRVYIWNHESPRGSGARHEKSSWSHEHFLSQGVSVAIP
WSGMKLQPPVWNSPEFMLGQRHNLLSLQGGKDVGCQNGL
LSREAGEGQESETPLHYISQVSHSCGSQNMVDRDGQDDL
SRYSACISDSRLSSFMAFPESPGNPDDLNSKVFFSDSSS
KGSATWPEEKLPPTRKQSRSNSTSSHYDTLKTHLGNTIQ
GQSGASAAVAWGLVIVTAGHGGEIRSFQNYGLPVRL
235 The amino acid sequence of SEQ ID MPSIPAIGEFTVCEINRELLTTKDESDTQAKDAYAKILG
507. The conserved G-protein beta LVFPPISFQIEEGFGSASRQQFDQDLDREDTIVTPSTSE
WD-40 repeat domain is underlined. GTNALQEGGLLLKGVSVLKNILASSFGPIFSPNDTKVLK
KVELLQGISWHRHKHILAFISGSNQVTVHDFQDPEWRES
SLLVSESQRGIEALEWRPNGGTTLSVACRGGICIWSASY
PGSVAPVRSGVASFLGTSTRGSSVRWTLVDFLQIPGGKA
VTALSWSPTGRLLASASREDSSFTIWDVAQGVGTPLRRG
LGGISLLKWSPTGDYLFSAKPNGTFYLWETNTWTLEQWS
SSGGCVISATWGPDGRMLFMAFSESTTLGSLHFAGRPPS
LDAHLLPMELPEIGSITGGFGNIEKMAWDGCGERLAVSY
TGGDLMYVGLIAIYDTRRTPFISASLVGFIRGPGEQVKP
LAFAFHDKFKQGPLLSVCWSSGLCCTYPLIFRAH
236 The amino acid sequence of SEQ ID MEEENAKHTEETRQVQVRFTTKLQPALRVPTTSIAIPAH
508. The conserved G-protein beta LTRYGLSDIVNTLLGNDKPQPFDFLVESELVRTSLEKLL
WD-40 repeat domains are LIKGISAEKILNIEYILAVVPPKQEEPSLHDDWVSVVDG
underlined. SYPNFIFSGSFDSIGRIWKGEGLCTHVLEGHRDAITSAA
FIMPSDSSDSFINLATASKDRTLRLWQFKPNEHMTNGKM
VRPYKLLKGHTSSVQTVSACPRRNLICSGSWDCSIKIWQ
TAGEMDIESNAGSVKKRKLEDSTEQIISQIEASRTLEGH
SQCVSSVVWLEKDTIYSASWDHSVRSWDVETGVNSLTVG
CRKALHCLSIGGEGSALIAAGGADSVLRIWDPRMPGTFT
PILQLSSHKSWITACKWHPKSRHHLISASHDGTLKLWDV
RSKVPLTTLEAHKDKVLCADWWKEDCVISGGADSTLQIF
SNLNLT
237 The amino acid sequence of SEQ ID MNRLRSKRNHILELRLGQSEPEKEATLASNRSRGTNAPI
509. The conserved RING-type zinc VVEDDDDVVVSSPRSFALARSSVSQRSSRIPIVNEEDLE
finger is underlined. LRLGLAVTGRTSAEHNPRRRHGRVPPNKPIVLCDDAGEA
DQSSSKKRRTGQQLSSDVQSDESKEVKLTCAICISTMEE
ETSTICGHIFCKKCITNAIHRWKRCPTCRKKLAINNIIHR
IYISSSTG
238 The amino acid sequence of SEQ ID MEEPPPPAVLPSSEDTSIVSSHSFVNAPPTVPVGLDASI
510. The conserved G-protein beta PQISTPGINQPGLTIPVPPEAAPLTASLVAASAGMPPAV
WD-40 repeat domains are VPSFVRPAIVAHPSVMPPPSMPLAALPMPVASAVPVAAP
underlined and the splicing factor HFPPSTPNDNSITPSMPVPTPIVASSSVPPSVTIPGIAP
motif is in bold. LPFIAPIPVPSSRPVAPSPFMPPARPLGASVSVAMDVDN
TDEQDQDADNKGESPSSSPDHPEDPSAAEYEITEESRKV
RERQEQAIQELLLRRRAYALAVPTNDSSVRARLRRLNEP
ITLFGEREMERRDRLRALMAKLDAEGQLEKLMKVQEEEE
AAANVDAEEVQEMEGPQVYPFYTEGSQELLKARTEITKF
SLPRAVSRLQRARRKREDPDEDEDEELKCVLQQSAQINM
DCSEIGDDRPLSGCAFSSDGTLLATSAWSGVTKLWSVPN
INKVATLKGHTERVTDVAFSPTNCHLATACADRTAMLWN
SEGVLMKTYEGHLDRLARLAFHPSGLYLGTASFDKTWRL
WDVNTGIELLLQEGHSRSVYGIAFQCDGSLAATCGLDGL
ARIWDLRTGRSILALEGHVKPVLGIDFSPNGYHLATGSE
DHTCRIWDLRKRQSVYIIPAHSHLVSQVKFEPQEGYFLV
TASYDSTAKVWSARDFKSIKVLAGHEAKVTSVDITADGQ
YIATVSHDRTIKLWSSKNSTNDMNIG
239 The amino acid sequence of SEQ ID MKRAYKLQEFVAHASNVNCLKIGKKSSRVLVTGGEDHKV
511. The conserved G-protein beta NMWAIGKPNAILSLSGHSSAVESVTFDSAEALVVAGAAS
WD-40 repeat domains are GTIKLWDLEEAKIVRTLTGHRSNCISVDFHPFGEFFASG
underlined and the Trp-Asp (WD) SLDTNLKIWDIRRKGCIHTYKGHTRGVNSIRFSPDGRWV
repeats signature is in bold. VSGGEDNIVKLWDLTAGKLMHDFKCHEGQIQCMDFHPQE
FLLATGSADRTVKFWDLETFELIGSAGPETTGVRAMIFN
PDGRTLLTGLHESLKVFSWEPLRCYDAVDVGWSKLADLN
IHEGKLLGCSYNQSCVGVWVVDISRVGPYAAGNVSRTNG
HNEAKLASSGHPSVQQLDNNLKTNMARLSLSHSTESGIK
EPKTTTSLTTTEGLSSTPQRAGIAFSSKNLPASSGPPSY
VSTPKKNSTSRVQPTTNFQTLSRPDIVPVIVPRSNSLRP
ETTSDAKKEMNNFGRVVPSTVSTKSTDVIKSGSNRDESD
KIDSINQKRMTGNDKTDLNIARAEQHVSSRLDNTNTSSV
VCDGNQPAARWIGAAKFRRNSPVDPVVSPHDRSPTFPWS
ATDDGVTCQPDRQVTAPELSKRVVEPGRARALVASWETR
EKALTADTPVLVSGRPPTSPGVDMNSFIPRGSHGTSESD
LTVSDDNSAIEELMQQHNAFTSILQARLTKLQVIRRFWQ
RNDLKGAIDATGKMGDHSVSADVISVLIERSEIFTLDIC
TVILPLLTRLLQSETDRHLTVAMETLLVLVKTFGDVIRA
TISATPTIGVDLQAEQRLERCNLCYVELENIKQILVPLI
RRGGAVAKSAQELSLALQEV
240 The amino acid sequence of SEQ ID MAGSDENNPGVVGGAHVQEGLRVGAGKMGAGNVQQRRAL
512. The conserved cyclin N- and SNINSNIIGAPPYPCAVNKRVLSEKNVNSENDLLNAAHR
C-terminal family domains are PITRQFAAQMAYKQQLRPEENKRTTQSVSNPSKSEDCAI
underlined. LDVDDDKMADDFPVPMFVQHTEAMLEEIDRMEEVEMEDV
AEEPVTDIDSGDKENQLAVVEYIDDLYMFYQKAEASSCV
PPNYMDRQQDINERMRGILIDWLIEVHYKFELMDETLYL
TVNLIDRFLAVQPVVKKKLQLVGVTAMLLACKYEEVSVP
VVEDLILISDRAYSRKEVLEMERLMVNTLHFNMSVPTPY
VFMRRFLKAAQSDKKLELLSFFIIELSLVEYDMLKFPPS
LLAASAIYTALSTITRTKQWSTTCEWHTSYSEEQLLECA
RLMVTFHQRAGSGKLTGVHRKYSTSKFGHAARTEPANFL
LDFRL
241 The amino acid sequence of SEQ ID MQAPREGKSAAAIVGMGKYMKKSKAIPRDVSLLEASPRS
513. The conserved cyclin- PSATGVRTRAKTLASRRLRRASQRRPPPPAAAAAAAAPS
dependent kinase inhibitor domain LDASPCPFSYLQLRSRRLRRPRLAPSPEARIDEGPAGSG
is underlined. SRGSRDASCSARTASSSGGVEGEGACVGRGDRGNGGECV
RDAAVDASYGENDLEIEDRDRSTRESTPCSLIRDSNANT
PPGSTTRQQSSCTAHRTQMSILRSIPTSDEMEEFFAYAE
QRQQRSFIEKYNFDIVKDRPLPGRFEWVQVIP
242 The amino acid sequence of SEQ ID MDGHSSHLAAQNRSRGSQTPSPSHSAASASATSSIHLKR
514. The conserved GCN5-related N- KLSAANASAASAAAAAAAAAAAADDHAPPFPPSSISADT
acetyltransferase family domain is RDGALTSNDDLESISARGGGAGDDSDDDSDDEEEDDGDN
underlined and the bromodomain is DGGSSLRTFTAARLENVGPAAARNRKIKAESNATVKVEK
in bold. EDSAKDGGNGAGVGALGPAATSGAGSGSGTVPKEDAVKI
FTENLQASGAYSAREENLKREEEAGRLKFECLSNDGVDD
HMVWLIGLKNIFARQLPNMPKEYIVRLVMDRNHKSVMVI
RRNLVVGGITYRPYASQKFGEIAFCAIKADEQVKGYGTR
LMNHLKQHARDVDGLTHFLTYADNNAVGYFIKQGFTKEI
YLDKDRWHGYIKDYDGGILMECKIDPKLPYTDLSTMVRR
QRQAIDEKIRELSNCHIVYQGIDFQKRDAGVPQNTIKME
DIPGLREAGWTPDQWGYSRFRGLSDQKRLTFFIRQLLKV
LNDHSDAWPFKEPVDAREVPDYYDIIKDPMDLKTMTKRV
ESEQYYVTLEMFIADVKRMFANARTYNSPDTIYFKIATR
LEAHFQSKVQSNLQSGAGKIQQ
243 The amino acid sequence of SEQ ID MFNGMMDPELFKLAQEQMNRMSPAELAKIQQQMMSNPEL
515. The conserved TPR repeat MRMASESMKNMRPEDLRQAAEQLKHVRPEEMAEIGEKMA
domain is underlined. NASPEEIAAVRARADAQMTYEINAAKILKKEGNELHSQG
RFKDASQKYLRAKNNLKGIPSSEGKNLLLACSLNLMSCY
LKTRQYEECIKEGSEALACEEKNLKAFYRRGQAYRELGQ
LKDAVSDLRKAHEISPDDETIAQVLRDTEESLTKEGGSA
PRGVVIEEITEEDETLASVNHESPSEYSEKRHQESEDAH
KGPINGDIMGQMTNSESLKALKGDPDAIRSFQNFISNAD
PTTLAAMGAGNAGEVSPDLIKTASSMIGKMSAEELQKMI
QLASSFPGENPYVTRNSDSNSNSFGNGSIPNVSPDMLKT
ASDMMSKMSPDDLQRMFEMASSSRGKDPSLDANHASSSS
GANLAANLNHILGESEPSSSYHIPSSSRNISSSPLSNFP
SSPGDMQEQIRNQMKDPAMRQMFTSMMKNMSPEMMANMG
KQFGLELSPEDAAKAQEAMSSLSPEMLDKMMRWADRAQR
GVETAKKTKNWLLGRPGMILAICMLLLAVILHRLGFIGS
244 The amino acid sequence of SEQ ID MIAAISWVPRGASKAVPEVAEPPSKEEIEEILKSGVVER
516. The conserved G-protein beta SGDSDGEEDDENMDAVASEKADEVSTALSAADALGRISK
WD-40 repeat domains are VTKAGSGFEDIADGLRELDMDNYDEEDEDVKLFSTGLGD
underlined. LYYPSNDMDPYLKDKDDDDDTEEIEDLSIKPMDSLIVCA
RTDDEVNLLEVYLLEPSLSDESNMYVHHEVVISEFPLCT
AWLDCPIKGGDKGNFIAVGSMEPAIEIWDLDIIDAVEPC
LVLGGQEELKKKKKKGKKASIKYKEGSHTDSVLGLAWNK
EFRNILASASADRQVKIWDVAAGKCNITMEHHTDKVQAV
AWNHHAPQVLLSGSFDHSVVMKDGRIPSHSGYRWSVTAD
VESLAWDPHSEHFFVVSLEDGTVRGFDVRAAISNSASQS
LPSFTLHAHEKAVSTISYNPAAPNLLATGSTDKMVKLWD
LSNNQPSCIASRNPKAGAVFSVSFSEDSPLLLAIGGSKG
RLEVWDTSSDAAVSRRFGKHGKPKTAEPGS
245 The amino acid sequence of SEQ ID MKFCKKYQEYMQGQEGKKLPGLGFKKLKKILKRCRRRDS
517. The conserved Zn-finger, RING LHSQKALQAVQNPRTCPAHCSVCDGSFFPSLLEEMSAVL
domain is underlined, and the SPX, GCFNKQAQKLLELHLASGFQKYLMWFKGKLRGNHVALIQ
N-terminal is in bold. EGKDLVTYALINAIAIRKILKKYDKIHLSTQGQAFKSQV
QRMHMEILQSPWLCELIAFHINVRETKANSGKGHALFEG
CSLVVDDGKPSLSCELFDSIKLDIDLTCSICLDTVFDSV
SLTCGHIYCYMCACSAASVTIVDGLKAAEPKEKCPLCRE
ARVFEGAVHLDELNILLSRSCPEYWAERLQTERVERVRQ
AKEHWESQCRAFMGVE
246 The amino acid sequence of SEQ ID MVSTQSTRENPSIFFPPPLKPWLLPVVLSLSLSRQLGMA
518. The conserved G-protein beta AAAAASLPFKKNYRSSQALQQFYAGGPFAVSSDGSFIAC
WD-40 repeat domains are NCGDSIKIVDSSNASLRPSIDCGSDTITALSLSPDGKLL
underlined. FSAGHSRQIRVWDLSTSTCLRSWKGHDGPVMSMACPVSG
GLLATGGADRKVMVWDVDGGFCTHFFKGHDGVVSTVLFH
PDSNRSLLFSGSDDGTIRVWDLLAKKCASTLRGHDSTVT
SLAFSEDGLTLLAAGRDKVVSLWDLHNYACKKTIPMYEV
LESVCVIHSGTVLASQLGLDDQLKVTKESAQNIHFITVG
ERGILRIWKSEGSVCLFKQEHSDVTVISDEDDSRSGFTA
AVMLPLDQGLLCVTADQQFLFYYPEKHPEGIFSLTLCRR
LVGYNEEIVDMKFLGEEENFLAVATNLEQVRVYELASMS
CSYVLAGHTETVLCLDTCISSSGRTLIVTGSKDNSVRLW
DSESRHCIGVGVGHMGAVGAVAFSRKRQDFFVSGSSDRT
LKVWSLDGISEDGVDSTNLKAKAVVAAHDKDINSVAVAP
NDSLVCSGSQDRTACVWRLPDLVSVVVLKGHKRGIWSVE
FSPVDQCVLTASGDKTVKIWAISDGSCLKTFEGHVSSVL
RASFLTRGTQFVSCGADGLVKLWTVRTNECIATYDQHSD
KVWALAVGKKTEMLATGGSDAVVNLWYDSTASDKEDAFR
KEEEGVLKGQELENAVSDADYTKAIELALELRRPHKLFE
LFSELCRTREVGDRVERILSALSGEEVCLLLEYIREWNA
KPKLCHVAQSVLSQVFRILSPTEIVEIKGIGELLEGLIP
YSQRHFSRIDRLVRSTYLLDYTLTGMSVIEPEADRSAVN
DGSPDKSGLEKLEDGLLGENVGEEKIQNKEELESSAYKK
RKLPRSKDRSKKKSKNVVYADAAAISFRA
247 The amino acid sequence of SEQ ID MDSAPRRKSGGINLPSGMSETSLRLDGFSGSSSSFRAIS
519. The conserved G-protein beta NLTSPSKSSSISDRFIPCRSSSRLHTFGLVERGSPVKEG
WD-40 repeat domains are GNEAYSRLLKAELFGSDFGSLSPAGQGSPMSPSKNMLRF
underlined. KTESSGPNSPFSPSILRQDSGFSSEASTPPKPPRKVPKT
PHKVLDAPSLQDDFYLNLVDWSSQNTLAVGLGTCVYLWS
ASNSKVTKLCDLGPNDGVCAVQWTREGSYISIGTSLGQV
QIWDGTQCKRVRTMGGHQTRTGVLAWNSRILASGSRDRV
ILQHDLRVPNEFIGKLVGHKSEVCGLKWSHDDRELASGG
NDNQLLVWNQHSQQPVLKLTEHTAAVKAIAWSPHQNGLL
ASGGGTADRCIRFWNTTNGHQTSSVDTGSQVCNLAWSKN
VNELVSTHGYSQNQIMVWKYPSMAKVATLTGHSLRVLYL
AMSPDGQTIVTGAGDETLRFWNVFPSAKAPAPVKDTGLW
SLGRTHIR
248 The amino acid sequence of SEQ ID MEDEAEIYDGVRAQFPLTFGKQSKPQTSLESVHSATRRG
520. The conserved G-protein beta GPAPAPAPASSSSLPSTTSPSAAGGAGKSSGLPSLSSSS
WD-40 repeat domains are TAWLEGLRAGNPRAGREAGIGSRGGDGEDGGRAMIGPPR
underlined. PPPGFSANDDGGGEDDDDDGDGVMVGPPPPPPGNLGDGD
DDEEEEEAMIGPPRPPVVDSDEEEEEEEEENRYRLPLSN
EIVLKGHNKIVSALAVDPTGSRVLSGSYDYTVRMFDFQG
MNSRLSSFRDFEPVEGHQVRNLSWSPTADRFLCVTGSAQ
AKIYDRDGLTLGEFVKGDMYIRDLKNTKGHITGLTWGEW
HPKTKETILTSSEDGSLRIWDVNDFKSQKQVIKPKLARP
GRVPVTTCTWDREGKCIAGGIGDGSIQIWNLKPGWGSRP
DIHVEQAHADDITGLKFSSDGKILLTRSFDDSLKVWDLR
LMKNPLKVFEDLPNHYAQTNIACSPDEQLFLTGTSVERE
STIGGLLCFFDRSKLELVSRIGISPTCSVVQCAWHPRLN
QIFATSGDKSQGGTHVLYDPTLSERGALVCVARAPRKKS
VDDFELKPVIHNPHALPLFRDQPSRKRQREKILKDPLKS
HKPELPMNGPGHGGRVGASKGSLLTQYLLKQGGMIKETW
MDEDPREAILKHADAAEKNPKFTRAYAETQPDPVFAKSD
SEDEDK

TABLE 12
Eucalyptus in silico Data.
SEQ ID ConsID eucSpp Family 1 2 3 4 5 6 7 8 9 10 11 12
1 3910 Cyclin- 0.25 0.11 0.20 0.73
dependent
protein
kinase
2 19213 Cyclin- 0.59 0.64
dependent
protein
kinase
3 36800 Cyclin- 0.11 0.36
dependent
protein
kinase
4 40260 Cyclin- 0.85
dependent
protein
kinase
5 41965 Cyclin- 0.35 0.86
dependent
protein
kinase
6 2906 Cyclin- 0.93 0.81
dependent
protein
kinase
7 1518 Cyclin- 0.08 0.28 0.08 0.06 0.11
dependent
protein
kinase
8 8078 Cyclin- 0.17 3.20
dependent
protein
kinase
9 9826 Cyclin- 0.36 0.23 0.15 0.04 0.24 0.43
dependent
protein
kinase
10 10364 Cyclin- 0.11 1.52 0.13
dependent
protein
kinase
11 11523 Cyclin- 0.15 0.06 0.15 2.40
dependent
protein
kinase
12 24358 Cyclin- 0.76 0.07 0.04 0.24
dependent
protein
kinase
13 39125 Cyclin- 0.23
dependent
protein
kinase
14 5362 Cyclin- 0.68 0.06 0.08 1.17
dependent
protein
kinase
15 44857 Cyclin- 0.68 0.06 0.08 1.17
dependent
protein
kinase
16 1743 Cyclin A 0.19 2.10 0.06 0.15
17 12405 Cyclin A 0.06 0.59 2.84
18 3739 Cyclin B 0.42 1.99 0.08 2.33
19 22338 Cyclin B 0.86
20 28605 Cyclin B 0.39 0.04 0.47
21 41006 Cyclin B 0.71
22 6643 Cyclin D 0.85 0.83 0.06 1.06 0.08 0.26
23 45338 Cyclin D 2.03
24 46486 Cyclin D 0.30
25 12070 Cyclin- 0.24 0.82 0.06 0.26 0.92
dependent
kinase
regulatory
subunit
26 6617 Histone 0.08 0.06 0.04 0.55 0.51 0.26
acetyltransferase
27 7827 Histone 2.27 0.11 0.04
acetyltransferase
28 8036 Histone 1.16
acetyltransferase
30 1596 Histone 0.17 0.16 0.08 2.98 0.88 0.26 0.98 0.71
deacetylase
31 5870 Histone 0.19 0.17 0.12 5.43
deacetylase
32 6901 Histone 1.21 0.08 2.01 1.16 0.08
deacetylase
33 6902 Histone 0.08 0.11 1.21 0.47
deacetylase
34 7440 Histone 0.48 1.23 0.15 0.22 0.48 0.20 2.02
deacetylase
35 8994 Histone 0.09 0.15
deacetylase
36 24580 Histone 0.42 1.22
deacetylase
37 37831 Histone 0.08 0.22 0.40 1.19 0.12
deacetylase
38 34958 MAT1 CDK- 0.15 0.23
activating
kinase
assembly
factor
39 22967 Peptidyl- 0.72 0.69
prolyl cis-
trans
isomerase
40 8599 Peptidyl- 0.46 0.08 0.50 0.17 0.51 0.28 3.01
prolyl cis-
trans
isomerase
41 9919 Peptidyl- 0.51 0.35 0.06 0.15 0.43 4.24
prolyl cis-
trans
isomerase
42 15820 Peptidyl- 0.04 6.78
prolyl cis-
trans
isomerase
43 8327 Peptidyl- 0.06 0.04 6.86
prolyl cis-
trans
isomerase
44 4604 Peptidyl- 0.68
prolyl cis-
trans
isomerase
45 966 Peptidyl- 0.59 1.02 0.54 0.69 0.50 0.93 0.59 0.95 18.65
prolyl cis-
trans
isomerase
46 1037 Peptidyl- 0.59
prolyl cis-
trans
isomerase
47 4603 Peptidyl- 0.17 0.17 1.24 0.04 0.34
prolyl cis-
trans
isomerase
48 5465 Peptidyl- 1.21 0.08 0.66 0.11 0.29 0.16 6.99
prolyl cis-
trans
isomerase
49 6571 Peptidyl- 0.51 0.08 0.41 0.08 1.14
prolyl cis-
trans
isomerase
50 6786 Peptidyl- 0.42 0.33 0.06 0.41 0.04
prolyl cis-
trans
isomerase
51 7057 Peptidyl- 0.42 0.11 0.04
prolyl cis-
trans
isomerase
52 8670 Peptidyl- 1.56 0.39 0.20 0.12
prolyl cis-
trans
isomerase
53 9137 Peptidyl- 0.04 0.59
prolyl cis-
trans
isomerase
54 10285 Peptidyl- 0.60 1.16 0.04 0.04 0.45
prolyl cis-
trans
isomerase
55 10600 Peptidyl- 0.16 0.17 0.06 0.46
prolyl cis-
trans
isomerase
56 11551 Peptidyl- 0.08 0.06 0.04 0.08 1.89
prolyl cis-
trans
isomerase
57 20743 Peptidyl- 0.76
prolyl cis-
trans
isomerase
58 23739 Peptidyl- 0.59
prolyl cis-
trans
isomerase
60 31985 Peptidyl- 1.99
prolyl cis-
trans
isomerase
61 32025 Peptidyl- 0.99
prolyl cis-
trans
isomerase
62 32173 Peptidyl- 1.99
prolyl cis-
trans
isomerase
64 9143 Retinoblastoma 0.90 0.15
related
protein
65 349 WD40 repeat 0.24 0.34 0.08 0.17 0.22 0.33 0.08 0.25 2.24
protein
66 575 WD40 repeat 0.25 0.94 0.31 0.34 0.11 0.16 0.47 1.87
protein
67 804 WD40 repeat 0.15 0.34 0.39 0.33 0.39 1.82
protein
68 805 WD40 repeat 0.97 0.51 4.66 0.23 0.17 0.77 0.33 1.07 0.24 4.43
protein
69 806 WD40 repeat 0.83 0.04
protein
70 2248 WD40 repeat 0.08 0.08 1.92 0.06 0.08 0.91
protein
71 3203 WD40 repeat 0.34 0.18 0.15 0.17 0.11 0.30 0.04 0.72
protein
72 3209 WD40 repeat 0.08 0.15 0.17 0.12 0.61
protein
73 4429 WD40 repeat 0.08 1.16 0.08 0.13
protein
74 4607 WD40 repeat 0.76 0.54 0.06 0.07
protein
75 4682 WD40 repeat 0.08 0.28 0.23 1.13 0.08 0.12
protein
76 5786 WD40 repeat 0.08 0.06 0.46 0.08 0.13
protein
77 5887 WD40 repeat 1.61 1.23 0.08 0.06 0.15 0.28 1.41
protein
78 5981 WD40 repeat 0.08 0.37
protein
79 6766 WD40 repeat 0.24 0.08 1.31 0.51 0.06 0.74 0.51 0.28
protein
80 6769 WD40 repeat 0.93 0.17 0.12 2.28
protein
81 6907 WD40 repeat 0.25 0.17 0.06 0.45 0.32 0.47 1.67
protein
82 7518 WD40 repeat 0.91 0.28 0.15 0.55 0.59
protein
83 7717 WD40 repeat 0.47 0.38
protein
84 7718 WD40 repeat 0.24 1.88 0.08 0.22 0.04 0.92
protein
85 7741 WD40 repeat 1.42 0.11 0.47
protein
86 7884 WD40 repeat 1.33 0.15 0.24
protein
87 8258 WD40 repeat 0.72 0.19 0.23 0.87 0.15 0.08 0.08
protein
88 8465 WD40 repeat 0.47 0.08 1.75
protein
89 8616 WD40 repeat 0.57 0.08 0.69 0.16 0.13
protein
90 8690 WD40 repeat 0.26 0.08 0.35 1.39 0.34 0.32 2.13 0.80
protein
91 8708 WD40 repeat 0.57 0.04
protein
92 8850 WD40 repeat 0.09 0.06 0.27 2.03
protein
93 9072 WD40 repeat 1.21 0.17 0.48
protein
94 9465 WD40 repeat 0.24 0.72 0.33 0.15
protein
95 9472 WD40 repeat 0.36 1.99 0.11 0.61 6.90
protein
96 9550 WD40 repeat 0.90 0.11 1.78
protein
97 10284 WD40 repeat 0.24 0.08 1.82 1.22 0.16 0.47 0.28
protein
98 10595 WD40 repeat 0.16 0.17 0.11 6.52 0.85
protein
99 10657 WD40 repeat 0.06 0.12
protein
100 12636 WD40 repeat 0.06 0.65
protein
101 12748 WD40 repeat 1.50 0.08 0.06 1.67 0.04 0.38
protein
102 12879 WD40 repeat 0.08 0.33 0.06 0.04 0.08 2.00
protein
103 15515 WD40 repeat 0.35 0.30
protein
104 15724 WD40 repeat 0.25 0.33 0.15 0.47 0.04 0.39
protein
105 16167 WD40 repeat 0.24 0.52
protein
106 16633 WD40 repeat 1.96 0.12 0.42
protein
107 17485 WD40 repeat 0.65
protein
108 18007 WD40 repeat 0.12
protein
109 20775 WD40 repeat 0.17 0.08
protein
110 23132 WD40 repeat 2.42
protein
111 23569 WD40 repeat 0.91 0.91
protein
112 23611 WD40 repeat 4.15
protein
113 24934 WD40 repeat 0.34 0.04
protein
114 25546 WD40 repeat 0.09
protein
115 30134 WD40 repeat 0.07
protein
116 31787 WD40 repeat 0.19 1.19
protein
117 34435 WD40 repeat 0.35 0.08
protein
118 34452 WD40 repeat 1.44 0.20 0.25
protein
119 35789 WD40 repeat 0.20
protein
120 35804 WD40 repeat 0.19 0.27 0.08
protein
121 43057 WD40 repeat 0.30 0.57
protein
122 46741 WD40 repeat 0.46
protein
123 47161 WD40 repeat 1.78
protein
235 6366 WD40 repeat 0.08 0.68 0.23 0.93 0.11 0.36 0.83 0.24 0.94
protein
236 17378 WD40 repeat 0.65 0.12 0.08
protein
252 45414 Cyclin B 3.13
253 44328 Cyclin- 0.38
dependent
kinase
inhibitor
254 15615 Histone 0.22 0.04
acetyltransferase
255 17239 Peptidyl- 0.08 0.50 0.08
prolyl cis-
trans
isomerase
256 18643 WD40 repeat 0.04 0.90
protein
257 19127 WD40 repeat 0.04 0.89
protein
258 22624 WD40 repeat 1.16
protein
259 32424 WD40 repeat 0.50
protein
260 37472 WD40 repeat 0.08 0.17
protein
In Table 12, the column numbers 1-12 represent the following tissues:
1 is bud reproductive;
2 is bud vegetative;
3 is cambium;
4 is fruit;
5 is leaf;
6 is phloem;
7 is reproductive;
8 is root;
9 is sap vegetative;
10 is stem;
11 is whole; and
12 is xylem.
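
The tissue key above can be expressed as a small lookup table. The following is a minimal sketch in Python and is not part of the original disclosure; the names `TABLE_12_TISSUES` and `parse_row` are hypothetical. It assumes that a table row has already been joined onto a single line and that every expression value contains a decimal point. Because empty cells are not preserved in the flattened rows, the sketch returns the values in row order and does not attempt to assign them to tissue columns.

```python
import re

# Tissue key for Table 12, copied from the legend (column number -> tissue).
TABLE_12_TISSUES = {
    1: "bud reproductive",
    2: "bud vegetative",
    3: "cambium",
    4: "fruit",
    5: "leaf",
    6: "phloem",
    7: "reproductive",
    8: "root",
    9: "sap vegetative",
    10: "stem",
    11: "whole",
    12: "xylem",
}

# Assumption: expression values always look like "0.19" or "18.65".
_VALUE = re.compile(r"^\d+\.\d+$")


def parse_row(line):
    """Split one flattened row into (seq_id, cons_id, family, values).

    The first two tokens are taken as SEQ ID and ConsID. Of the remaining
    tokens, decimal numbers are treated as expression values and everything
    else as part of the family name.
    """
    tokens = line.split()
    seq_id, cons_id = int(tokens[0]), int(tokens[1])
    family_words = []
    values = []
    for tok in tokens[2:]:
        if _VALUE.match(tok):
            values.append(float(tok))
        else:
            family_words.append(tok)
    return seq_id, cons_id, " ".join(family_words), values


if __name__ == "__main__":
    # Row for SEQ ID 16 as it appears in Table 12.
    print(parse_row("16 1743 Cyclin A 0.19 2.10 0.06 0.15"))
    print(TABLE_12_TISSUES[3])
```

Keeping the values as an ordered list rather than a tissue-keyed mapping is deliberate: the column position of each value cannot be recovered from the flattened text alone.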

TABLE 13
Pine in silico data.
SEQ ID ConsID pinus Radiata Family 1 2 3 4 5 6 7 8 9 10 11 12
124 1766 Cyclin- 1.02 0.05 1.58 0.15 0.22 0.22 0.18 2.16 4.91
dependent
protein
kinase
125 2927 Cyclin- 0.16 0.19 0.11 0.14 0.04 0.36 0.38 0.17
dependent
protein
kinase
126 7642 Cyclin- 0.22 0.21 0.05 0.07
dependent
protein
kinase
127 13714 Cyclin- 0.11 0.11
dependent
protein
kinase
128 16332 Cyclin- 0.54 0.26 0.14 0.04 0.91
dependent
protein
kinase
129 21677 Cyclin- 0.05 0.14 0.17
dependent
protein
kinase
130 27562 Cyclin- 0.41
dependent
protein
kinase
131 1504 Cyclin- 0.16 0.36 0.35 0.21 0.54 0.09 0.65
dependent
protein
kinase
132 15211 Cyclin- 0.13 0.15 0.19 0.19
dependent
protein
kinase
133 20421 Cyclin- 0.04 0.05 0.95
dependent
protein
kinase
134 3187 Cyclin- 0.34 0.15 0.04 0.18 0.38
dependent
protein
kinase
135 15661 Cyclin- 0.04 0.13
dependent
protein
kinase
136 13874 Cyclin A 0.31 0.27 0.15 0.05
137 14615 Cyclin A 0.16 0.15
138 4578 Cyclin B 0.47 0.14 0.13 0.22 0.74 0.38
139 23387 Cyclin B 0.29 0.26 0.17
140 6970 Cyclin D 0.14 0.27 0.04
141 10322 Cyclin D 0.16 0.19 0.06 0.14 1.12 1.36
142 22721 Cyclin D 0.27 0.36
143 23407 Cyclin D 0.15 0.26 0.31
144 1945 Cyclin- 0.28 0.55 0.41 0.16 1.62 5.02 0.22 0.72 0.39 3.06
dependent
kinase
regulatory
subunit
145 8233 Cyclin- 0.21
dependent
kinase
regulatory
subunit
146 8234 Cyclin- 0.16 0.11
dependent
kinase
regulatory
subunit
147 22054 Cyclin- 0.05 0.22 0.18
dependent
kinase
regulatory
subunit
148 12137 Histone 0.06 1.51 0.19
acetyltransferase
149 12582 Histone 0.64 0.15 1.09 0.33 0.63
acetyltransferase
150 15285 Histone 0.21 0.12 0.70 0.14
acetyltransferase
151 17229 Histone 0.94 0.16
acetyltransferase
152 20724 Histone 0.04 0.19 0.19
acetyltransferase
153 4555 Histone 0.16 0.14 0.97 0.14 0.89 0.89
deacetylase
154 4556 Histone 0.14
deacetylase
155 5729 Histone 0.31 0.28 0.22 0.58 0.22 2.00 0.48 0.07 0.04 2.73 1.46
deacetylase
156 7395 Histone 0.14 0.14 0.19 0.93 0.04 0.14 1.33
deacetylase
157 9503 Histone 0.11 0.14
deacetylase
158 11283 Histone 0.19 0.15 0.96 1.35
deacetylase
159 12322 Histone 0.16 0.06 0.11 0.04 0.05 0.29
deacetylase
161 23236 Histone 0.13 0.11
deacetylase
162 171 Peptidyl- 0.07 0.46
prolyl
cis-trans
isomerase
163 172 Peptidyl- 0.19 0.11 0.18 0.11 0.46
prolyl
cis-trans
isomerase
164 1480 Peptidyl- 2.51 4.20 0.88 2.97 1.58 3.53 7.36 1.33 2.74 0.72 6.62 10.14
prolyl
cis-trans
isomerase
168 1692 Peptidyl- 0.16 0.22 0.65 0.61 0.26 0.29 0.18 1.28 0.34
prolyl
cis-trans
isomerase
169 5313 Peptidyl- 0.14 0.07 0.37 0.17
prolyl
cis-trans
isomerase
170 6362 Peptidyl- 0.14 0.33 0.05 0.06 0.60 0.04 2.92 0.68
prolyl
cis-trans
isomerase
171 6493 Peptidyl- 0.42 0.11 0.21 0.11 0.04 0.25 0.32
prolyl
cis-trans
isomerase
172 6983 Peptidyl- 0.61 0.13 0.04
prolyl
cis-trans
isomerase
174 7665 Peptidyl- 0.11 0.39 0.05 0.62 0.25
prolyl
cis-trans
isomerase
175 12196 Peptidyl- 0.19 0.15 0.14 0.16
prolyl
cis-trans
isomerase
176 13382 Peptidyl- 0.25 0.06 0.07 0.04 0.87 0.15
prolyl
cis-trans
isomerase
177 16461 Peptidyl- 0.19 0.15 0.15 0.04 0.04 0.74
prolyl
cis-trans
isomerase
178 17611 Peptidyl- 0.24 0.11 0.27 0.41 0.99
prolyl
cis-trans
isomerase
179 19776 Peptidyl- 0.13 0.07 0.16 0.05 0.61
prolyl
cis-trans
isomerase
180 20659 Peptidyl- 0.15 0.19
prolyl
cis-trans
isomerase
181 22559 Peptidyl- 0.11 0.14 0.20
prolyl
cis-trans
isomerase
182 24188 Peptidyl- 0.23
prolyl
cis-trans
isomerase
183 27973 Peptidyl- 1.01
prolyl
cis-trans
isomerase
184 1353 WD40 0.44 0.05 0.73 0.11 1.07 0.70 1.32
repeat
protein
185 1978 WD40 0.14 0.05 0.44 0.11 0.21 0.27 0.36 1.46 0.82
repeat
protein
186 2810 WD40 0.42 0.79 0.11 0.39 0.27 0.36 1.69 1.03
repeat
protein
187 2811 WD40 0.14 0.09 0.14
repeat
protein
188 2812 WD40 0.15 0.18 0.04 0.16
repeat
protein
189 3514 WD40 0.63 0.06 0.14 0.18 0.48 0.56
repeat
protein
190 4104 WD40 0.14 0.25 0.27 0.37 0.36 0.19 0.18 0.39 0.53
repeat
protein
191 5595 WD40 0.14 0.25 0.15 0.14 0.07 0.23
repeat
protein
192 5754 WD40 0.31 0.14 0.06 0.07 0.16 0.10 0.16
repeat
protein
193 6463 WD40 0.16 0.56 0.22 0.43 0.81 0.53 0.21 0.08 1.00 0.70
repeat
protein
194 6665 WD40 0.31 0.28 0.45 0.44 0.96 0.07 3.37 2.68
repeat
protein
195 6750 WD40 0.14 0.59 0.05 0.37 0.42 0.04 0.18 0.52
repeat
protein
196 7030 WD40 0.31 0.40 0.54 0.45 0.37 0.07 1.58 3.41
repeat
protein
197 7854 WD40 0.11 0.14 0.05
repeat
protein
198 7917 WD40 0.22 0.39 0.13 0.15 0.18 0.56
repeat
protein
199 7989 WD40 0.11 0.04 0.11
repeat
protein
200 8506 WD40 0.47 0.33 0.11 0.86 0.19 1.28 0.04 1.23 3.12
repeat
protein
201 8692 WD40 0.21 0.06 0.11 0.15 0.10 0.87
repeat
protein
202 8693 WD40 0.11 0.80 0.25 0.14 0.18 0.53 0.31
repeat
protein
203 9170 WD40 0.16 0.11 0.05 0.05
repeat
protein
204 9408 WD40 0.33 0.05 0.41 0.15 0.14 0.41 0.33
repeat
protein
205 9522 WD40 0.11 0.18
repeat
protein
206 9734 WD40 0.11 0.05 0.11 0.15 0.07 0.25 0.11
repeat
protein
207 9815 WD40 0.11 0.18 0.14
repeat
protein
208 10670 WD40 0.40 0.16 0.11 0.16 0.34 0.31
repeat
protein
209 11297 WD40 0.53 0.15 0.16 0.05
repeat
protein
210 13098 WD40 0.19 0.11 0.54 0.31 0.14 0.26 1.85 0.14
repeat
protein
211 13172 WD40 0.04
repeat
protein
212 13589 WD40 0.11 0.06 0.21 0.05 0.37
repeat
protein
213 13608 WD40 0.11 0.04 0.59 0.33
repeat
protein
214 14299 WD40 0.16 0.05 1.09 0.38
repeat
protein
215 14498 WD40 0.21 0.44 0.30
repeat
protein
216 14548 WD40 0.16 0.11 0.11 0.82
repeat
protein
217 14610 WD40 0.16 0.27
repeat
protein
218 16090 WD40 0.43 0.04 0.37 0.85
repeat
protein
219 16722 WD40 0.10
repeat
protein
220 16785 WD40 0.05 0.13 0.38 0.50
repeat
protein
221 17094 WD40 0.29 0.15 0.24 0.81
repeat
protein
222 17527 WD40 0.04 0.10
repeat
protein
223 17591 WD40 0.14 0.10
repeat
protein
224 17769 WD40 0.39
repeat
protein
225 18047 WD40 0.05 0.22 0.98 0.15 2.68 0.07 0.19 0.80
repeat
protein
226 18414 WD40 0.16 0.15 0.34 0.23 0.19
repeat
protein
227 18986 WD40 0.41 0.15
repeat
protein
228 19479 WD40 0.05 0.28 0.32
repeat
protein
229 20144 WD40 0.43 0.29 0.05
repeat
protein
230 22480 WD40 0.15 0.27
repeat
protein
231 23079 WD40 0.13 0.04
repeat
protein
232 26739 WD40 0.15 0.18
repeat
protein
233 26951 WD40 0.21 0.20
repeat
protein
234 26529 WEE1-like 0.04 0.18
protein
237 888 WD40 0.11 0.18
repeat
protein
238 14166 Cyclin- 0.16 0.05 0.05
dependent
kinase
inhibitor
239 3189 Cyclin- 0.06
dependent
protein
kinase
240 9356 Histone 0.11 0.22 0.46
acetyltransferase
241 65 Histone 0.16 0.22 0.27 0.22 0.24 0.34
deacetylase
242 14197 Histone 0.16 0.33 0.05
deacetylase
243 9081 Peptidyl- 0.11 0.05 0.29 0.26 0.69
prolyl
cis-trans
isomerase
244 13417 Peptidyl- 0.06 0.59
prolyl
cis-trans
isomerase
245 5755 WD40 0.16
repeat
protein
246 6670 WD40 0.14 0.05
repeat
protein
247 7027 WD40 0.14 0.15 1.30 0.15
repeat
protein
248 7276 WD40 0.14 0.11 0.05
repeat
protein
249 7390 WD40 0.31 0.14 0.11 0.44 1.29 0.38
repeat
protein
250 12648 WD40 0.05 0.06 0.05 0.94
repeat
protein
251 13171 WD40 0.19 0.63 0.19 0.34
repeat
protein
In Table 13, the numbers 1-12 represent the following tissues:
1 is bud reproductive;
2 is bud vegetative;
3 is callus;
4 is cambium;
5 is meristem vegetative;
6 is phloem;
7 is reproductive female;
8 is reproductive male;
9 is root;
10 is vascular;
11 is whole; and
12 is xylem.

TABLE 14
Oligo Table.
Oligo SEQ ID   Oligo ID   Microarray Oligo Seq
521 Euc_003910_O_4 GATTTTAAGTAACTCAATTAGCAGTTCCAACATTAAACCATTATTATTACCCCTTTTATC
522 Euc_019213_O_1 CTCAAAAAGTACTTGGATGCGTGCGGTGACAACGGACTCGAACCGTACACTGTCAAATCT
523 Euc_036800_O_4 TTGTCAAGTTGCAGGACGTAGTGCACAGTGAGAGGCGTCTATATCTAGTTTTTGAGTACT
524 Euc_040260_O_1 GAAGAAATTATATAACTAGATACAAGGTTAGCTAGGTATATAATAGCGGTACAAGTCTTT
525 Euc_041965_O_1 GGACAAATCAAGTAGAACTTCTCTCGGCAGCATCAGTTTTTCTAATCCATGCCTTGTTGC
526 Euc_002906_O_1 CTCAGTTCTGATAATGCCTCGGATATATGGCCGAGTGTTCGCTGGACGGCCTCTTATGTT
527 Euc_001518_O_3 GGAGATTCTGAACTGCAACAGCTCCTACACATTTTCAGACTGTTGGGTACTCCAAATGAA
528 Euc_008078_O_2 GACTGGTAAAATCGTTGCACTAAAAAAGGTCCGGTTTGACAACTTGGAACCTGAAAGCGT
529 Euc_009826_O_4 AAACACCAATCTATCAACACTGTCGAGTTTAGTCACTAGTAGAACCGGAGATAACAAACA
530 Euc_010364_O_1 CTATGATCCTGAGCGCAAGCAAGTTATGACCAATAGAGTCGTTACACTATGGTACCGAGC
531 Euc_011523_O_1 TGTTGTGAAGGTAGTTATAGCCATCGATTAGACAGTGATTAAAGTAGTACCCGTGCCAAT
532 Euc_024358_O_2 CCACATACAAGAGTTGTTACGCTACACATCCTATACCATCAAAGGAACGTTGGAATGCCA
533 Euc_039125_O_3 TATGATCGACACAAGCATTTTGTGTTGGAGCCTCAGCTAATTGTATGTCATCGAGTACTT
534 Euc_005362_O_3 AAAATTTTTGCTACGGATAATGTTGTGAGGCGAGGCAGTCGAAATTACGGAGGTTGACTT
535 Euc_044857_O_1 ATGCAGGGATCAAATTTGTGAGTACTACGTAAAATTTTGCTACGGAGGCGAGGCAGTCGA
536 Euc_001743_O_1 GAAGAATACAGGCTCGTACCTGATACACTGTACCTGACTGTTAACTACATAGATCGGTAT
537 Euc_012405_O_1 TCCACCCTAAATGCGATACGTGAAAAGTATAGACAACAGAAGGTAAACTATTCATTACTG
538 Euc_003739_O_2 AGGCTTCTAGTTGCGTTCCCCCAAACTACATGGATCGGCAGCAGGATATTAATGAGCGGA
539 Euc_022338_O_2 GAGAAAAATGACAGATTGATATCGATGATGATGACTGTCGTGTCATCAGTAGTGTGCTTT
540 Euc_028605_O_5 TTTCCAATTGTAGTTCGTCTTTTATTGTAACAATAAATTGATAGATACTGATTCGAAATA
541 Euc_041006_O_1 ACATTTATGCTAACTATAGGAGAACGGAGAATTGTAGCTGCGTCTCTGCTAACTACATGG
542 Euc_006643_O_1 TTCTGGCTTAAAGGCTATTCTTTGTGCACAATGACCTGAGGGAGGTCTCGACAGACCACT
543 Euc_045338_O_1 TTCATCCGGGTCCTGGTTATCATACTCTTATATATGTTGGGGAATAACGGTTCATATGTT
544 Euc_046486_O_3 GGGTGTGCTTAATAGTTCTTATTAGTCTTAGCTTATTATCTTTGATTGGACATGCTATAA
545 Euc_012070_O_2 CTTGCTAAGTAGACATGTTATATTTCTAATGCTTTGAGAACAATATTACAGTATAATTAG
546 Euc_006617_O_2 AATCATCGACTAGACCGATGGTCAAAGTGGTAATCATGTAATTAAACGCGTTTGTCATTG
547 Euc_007827_O_2 ATGGAAAAATCTATGGATATGAAGGATTGAAGATATCCGTCTGGGTAAGCTGTGTATCAT
548 Euc_008036_O_3 TTATGATTTGAGAAAACCCTTGCAGGCTGCGATTTGCGGATCATGACAGCATAGTTTTGC
549 Euc_001596_O_2 GTTTTGTTGTGAGGGCTTGGTAGGTTTTCATTATATTGTAATGTCGACGACAGAGATTTT
550 Euc_005870_O_3 CCAATTAATGTTACTGCTCAAGCTGACGTACCTGCGAAAAAAGCACCAGTGACTGCTAAT
551 Euc_006901_O_3 TGATGTCAAAACGTAGCTCTTTTTTGTGTGAGCTATCCTGCTAAATTAAACCTCAGCAAA
552 Euc_006902_O_1 ACATGAGTATTATGAATACTTCGGTCCTGACTATACACTTCATGTTGCTCCGAGTAACAT
553 Euc_007440_O_2 GAATTGGCGATCACAATCTACTGTAGTCAATACTCAAGTGGGAGGTGTAAATAGATTCCA
554 Euc_008994_O_1 GATCATGTGTAATCAGTATATCAGGTTAGAAACAGTACTCTTGAGCTTAGCGGGCACTGT
555 Euc_024580_O_2 TCCTGTGAAGGTGGTCGACTCAATCAAAAGGTACCTTGTAGATAAGGTACCTTTTCTCAA
556 Euc_037831_O_5 GCATTTTATACGACGGATAGAGTCATGACCGTATCTTTCCATAAGTTTGGGGACTTCTTC
557 Euc_034958_O_3 CCTCGTTTCTTTGCGGTTCGGACGCATCATGGATGTATCTCCAAAGAGTAATCTGTCGAT
558 Euc_022967_O_2 AATTCAGATCTATTAGTGAAAGTTGGCATGAGTCTCAATCTTAGGGGAATACAGTACGGA
559 Euc_008599_O_3 TGATATGAGTATCATAACTCGGATGGTGACAACTTTGTACTACGGTCGGCACCGGTAGAT
560 Euc_009919_O_1 CATATACAATCTTAGTGGATTAGCTGAGGTCGAAACTGACAAGAGTGATCGCCCGTTGGA
561 Euc_015820_O_2 CATGGCTAACGCTGGCCCTAGCACTAATGGGAGCCAATTTTTCATATGCACTGTAAAGAC
562 Euc_008327_O_2 AACAAAGTCTACCTTGACATTAGCATCGGTAACCCTGTCGGGAAACTAGTCGGAAGAATT
563 Euc_004604_O_2 TGTGCTTGGATATACTGTATAAGCATTCTATATTATGCTTGTTGGCTTCGTTTTGAGGGA
564 Euc_000966_O_1 TTAACGTCGACCGCTTCTCTGCCCCTTGAATTTTCCCGAGAAAACCAGGAACCTGCCAAA
565 Euc_001037_O_1 TGTTGAATACGATGTATTATAATGTTGGTGTCTTGGTGAAATACAGAATTATGCTTGCGT
566 Euc_004603_O_2 ATCGCTGTGGCTGATCTCGTCGCTCCGGCTTTTCATAAAAATCATGGCTGAGGCAATCGA
567 Euc_005465_O_2 CTCGCAACCCTATATCTCGCTCAGGCGAAGAAGTCTGAGGATTTGAAAGAGGTGACTCAC
568 Euc_006571_O_1 TGTTTTTGGGTACACGCAGTTAGGATAACTAGCATGAAAGCCCGATCCCGCATATACAGG
569 Euc_006786_O_2 GAGGACTAGCCGGAACTTCATCGAACTCTCTCGGAGGGGTTACTACGATAACGTCAAGTT
570 Euc_007057_O_1 GATGGCTAGCACTGTGTAGAAAGGTGAATTTAAAGTACTTGTCTACACTGCTTATTAAAT
571 Euc_008670_O_2 TGAGACTGTCTTGGCGTGTATTTTGGAATAAACTATTATCACGTTTTGTTAAATATAATA
572 Euc_009137_O_3 TTACAAAATGGCTCTCAGAAAGTATCGAAAGGCCCTGCGCTATCTGGATATCTGCTGGGA
573 Euc_010285_O_2 AATTTTATGTTTGCTACTGCTTAGTGCTTAATGGACTTGCGTAGGTATTCAAATTACAGA
574 Euc_010600_O_1 TGGAACCGTGGTATCGGCTGACGTTATCCGTGATTTTAAGACTGGAGATAGTTTATGCTA
575 Euc_011551_O_2 CTTTGATGTATCCTCAGTGTACTGCTTTTAGCTATGTATAGATCGAGTCAACTCATTGAA
576 Euc_020743_O_3 TTTTTATTATTTACCTTCGCCTTTACGCTGCATACGTTAATAGGTTATTATTTCCTTCAA
577 Euc_023739_O_1 ATTTGTCCATGACAATCGTAGTCGAAGACACGATACGCTCTTAGATGGTACGGAAATCTG
578 Euc_031985_O_2 TGAATAGAGATAACTTTTCTGAGTGTGAATTGGATATTACGTTGCAAATAGCCGAATGAA
579 Euc_032025_O_2 GCTTTAGGTTAGGGATCCCTGTAAGCTGATGATAGATATTGGAGATGGTACTTGTAAGAT
580 Euc_032173_O_1 TGTTGTGTTTGGAAAGGTGCTGTCTGGGATGGATGTTGTCCACAAGATTGAGGCTGAAGG
581 Euc_009143_O_1 GGAAAGCGGGGAATGAGCATGTGGATATTATCTCTTTCTACAATGAAATATTCATTCCTT
582 Euc_000349_O_1 CATCAGGACGTTGACTCTAATTAAGACATATGTGACAGAGCGCCCTGTTAATGCGGTTAC
583 Euc_000575_O_2 CTTTAGGTTTGATCTGTCTGTTTTGTCTATCCTGCGAGTTTCGAGCATGTGCGTGTGTGA
584 Euc_000804_O_1 CAGCCCCAATAGATACTGGCTCTGTGCCGCTACTGAGAACAGTATTAAAATCTGGGACCT
585 Euc_000805_O_2 AAGAATGAAGCTGATATGAGTGATGGAACTACGGGGGCCATGAGCTCAAATAAGAAGGTC
586 Euc_000806_O_1 TGACTACAATTAGCACCTCACCATTATCGAACTGTATAATTGTGCTTGCCTGCTATTATT
587 Euc_002248_O_4 TTGAAGCGGAAATATATATTTATGCTACTACATAAGTAATGTACTACTTGACAAGATGAG
588 Euc_003203_O_1 TACTCGATGTGGTATAGAATTTATCCAATGTACTCCTAAATGTAGATACATCGTGTATTG
589 Euc_003209_O_2 GCTTCGTCTGATACCACTATCAAGATAATAGGCGTGAGCAATAGCTCTGGATCACAGCAC
590 Euc_004429_O_4 GGTCGGCTTGCTAGTGTATCTGATGACAAGAGCATATCACTCTATGATTACTCATGAAGG
591 Euc_004607_O_3 GAAAGGAGAAAAGCATGGAGATCGATCTCGGAAACCTCGCATTCGACGTCGATTTTCATC
592 Euc_004682_O_1 GATTCAGTACCCGGATTCGCAAGTCAACCGGTTGGAGATAACTCCACATAAGCGGTACCT
593 Euc_005786_O_1 TTCCATGTATCAAGCCGCATCAATGTTTGTCGCTGCAATTAACATGTGTGCAGTCGATCC
594 Euc_005887_O_2 TTCAGCGCATTGTGTAAATGTAGATAGGTGATATATTTCTCGTTGCAATGTAGGGTAAGA
595 Euc_005981_O_2 TCCAATAATCACATTTACCATCAACAGGCATCAGCAACATACTGTTGTAGTGTAATTAAT
596 Euc_006766_O_1 GGGCATTCTGACTACCTGCACTGTATAGCTGCACGGAACTCTTCTAGTCAGATTATAACA
597 Euc_006769_O_1 AATCGTCTGGTAGATTGTCAAAAACTAATAAACCTGTGATTGATCCGGATTCTAGTAATG
598 Euc_006907_O_2 AGTTGAGGATTCTCCACTATGACAGCTCTCATGGCTTGAATCTAAAGTCATCTGGTTTTC
599 Euc_007518_O_1 GAACAATCATTCTGTAGAACACTAGAGTCTATATGCTTGACTGTATCGGTTAATTAATTC
600 Euc_007717_O_1 AGATAGCGATAGAGTTATACTGCATGTACTGAGGTAAATGTTTTGATTACTCCACCCAAT
601 Euc_007718_O_1 AAGAATTGTTAGGAGGTGTATACTTTCTGTAACTGTATTCAATGAGCATACACCTGACGG
602 Euc_007741_O_2 CAACTCATATAATGACTGGATTCTGGCAACCGCGTCTTCAGACACAACAGTTGGACTATT
603 Euc_007884_O_1 AGTGTAAAAGGATGCCCCTAATAGATTATATGCCAAGTGTAGTATATATAATAGTGCTTT
604 Euc_008258_O_2 AAGAATCTACAGTTGTCTTATGCTACTCTATTACTCAATTATGCTGTGCTATTGATTGAG
605 Euc_008465_O_4 TCTGAATACATACTTTGTGGTCTCTATAAAAGACCAATGATACAGGCATGGTCATTAATT
606 Euc_008616_O_5 TAAATCTTCTCATGTGCCTGGCGTAAATTTTGCAGTTATTACTAGACCAAGATAGTTTCA
607 Euc_008690_O_4 ACATGGATTCGATCAATCGCCACATGACAACTAAAACAAGCGGTTCACGTGATTGTAATT
608 Euc_008708_O_4 AGATGAGTATGCTCGGGTGTATGATATTCGCAATTACAAGTGGAATGGATCGCATAATTT
609 Euc_008850_O_5 TCTTTGATTCTGTTGTATGGTGTATCTTATTGTATCTTCTATCTGCCCCCCATGTAATTC
610 Euc_009072_O_1 TTCGTTGTGTAGTACTGGGAGTTACTACTTGTATGTATGTAAATCATGTGGCGTCTGTCC
611 Euc_009465_O_1 GGAGATGTGTAATATGTCTGAGCGGTCACACTCTAGCTGTTACATGCGTAAAGTGGGGAG
612 Euc_009472_O_3 CCACCGTTGCGTAACTCGAATAGCCGGATTTTCGTTTTCGTTTTTATTTCCCCGTTAATT
613 Euc_009550_O_1 TGAGATGCTCTGTGTGAGGACTTTTACGAAACTTGAATGGCCCGTAAGGACAATAAGCTT
614 Euc_010284_O_3 TGGGTTGTTGCGACGGGTTCTACAGATAAGACTGTTAAGTTATTTGATCTACGCAAGATC
615 Euc_010595_O_1 GCAGAGGTGCCTACATATGCTTTAGAATGCTAGTAGCTTGGAAGTGCAACACGCTCGTGA
616 Euc_010657_O_1 AGTAAAGTTTAACGACTATGCATCTGTCGTAGTATCAGCCGGCTATGATCGTTCAGTGCG
617 Euc_012636_O_2 CGTTAGGATAGTCTTTAAAGGAGTTGGTGATTATTGATTTCCACCCAATATATGTAGCGT
618 Euc_012748_O_2 GAGCAAGCTACTTACAAAAATCGACAGCGTCTTTACCTATCTGAACAGACAGATGGCAGT
619 Euc_012879_O_2 TCCTTCCGACAAGTACCGTATTGCAAGTTGTGGTATGGACAATACGGTTAAAATCTGGTC
620 Euc_015515_O_1 TTTCACTCGATGACGGTTGGCCGGATAAATAATCGCTTATATAGTCCTAATAAGTTCCAT
621 Euc_015724_O_3 ATATGTAGGTGGTAGAGGTGTGGATATTGCATAGACCGAACCTCCGCAGGTCCGCATTCT
622 Euc_016167_O_1 CCATTGAACTACTTATGGATTACTTTATACATGAAATATCATGCCGGAGTAATTTTGAGT
623 Euc_016633_O_3 AGCATTAGAGACCTGGATTTTAGTCTAGATTCAGAGTTTTTGGCTACGACATCTACTGAT
624 Euc_017485_O_3 AAAGGTTTATCCCTCATTGGATTTGATATATAAACTGAGAGTGTTTTGCCCCCCATTAAA
625 Euc_018007_O_1 GTACAGCGTGTATTTCTTGTTACGATACTTGAGGGGTTAGAGGCACCTACGAATTAGGAA
626 Euc_020775_O_3 ATATCCTTATGAATGAAGTTTGGATGATAAGTGGCGCCAGACTTTCTACTCACCCTTTTT
627 Euc_023132_O_3 TGATCACATCGTTGTTTGCAATAAGACGTCATCAATTTATATCATGACTCTACAGGGACA
628 Euc_023569_O_2 TTTTCCCAGTGTACTGCGAGAGTGATGCTACATAAGTTTACTCTTGTGTCTAACTTTTCC
629 Euc_023611_O_1 AGATTCTACAGATGGCGCTATACGAGCTGTTATACGGACATTTTATGACCATACACATCC
630 Euc_024934_O_3 TGCTACGGGAAACCAGGACAAAACTTGTAGGATTTGGGACATACGAAACTTATCTAAGTC
631 Euc_025546_O_1 CAAGTCATATAGTTACAGTGTCGCATGACAGAACAATTAAGCTCTGGACTAGTAACGACG
632 Euc_030134_O_2 TGCCACATCGTAACCATCATAGCACTTATCATCTAATTATGGTGAAAGGGAGTTATATAT
633 Euc_031787_O_5 GTTTATACTTATAAACAACAGAGAGACAACTGTACAGGTGTTGTAAACACTCCCAGTGTG
634 Euc_034435_O_1 CTGTGTTTTAGCCCGAGGGCCAATCACTTAGTTGCTACTTCGTGGGATAATCAGGTACGG
635 Euc_034452_O_3 GCAAAGTAGAGTTTAAGTTTCGTTGTGCTTGGACCGGAAAACTCACATGCTTAGAGTTTA
636 Euc_035789_O_5 AAGATTTGGGCATAACTTGTATGAACTTTTTCTGTTGTCGACACTGTAATTACACGAGCT
637 Euc_035804_O_4 AAACAGATGCATGTATGCTTCATAACTCTATAGATATGGAAATGTCACTGTACACTGATC
638 Euc_043057_O_2 TTATTGGTGCACAGGACGGAAAATTGCGCATATATTCTATTTCAGGTGATACATTAACAG
639 Euc_046741_O_1 AGGCACAGACACTTGCCTAAACCAATATACAAGGCAGGTATTCTAAGGCGCACCGTGAAT
640 Euc_047161_O_4 CATGCGAAGGTTTCTGGGAATTTTCAGTAGAAAATTCGGTCGTGGCGGCCATCCTCGATA
641 Pra_001766_O_1 TTAAGCTGATAGCTTTAGTTCCTACGTGGAATGTATAAATGCACCATTGTCCATAAGGCA
642 Pra_002927_O_2 GGATGCTCTGGTTACATGACTACTCCTTAGGGAATCAGTCAGACATTTTAAATAACTTCC
643 Pra_007642_O_2 TCATTAAGCGGTACTGGCAGAGGACATGTCTATTTATACAAGCAAATGGTCCTATTGGCT
644 Pra_013714_O_1 ATGTTGGTCAGACCTCAAATATTGTACTCCCCACACTAGGGAGCATTTACGGTGAATATA
645 Pra_016332_O_1 TCCTCTCGACCCTTAGAGTCCTCTGCGAATCTTGTTGTTAGTTACTGTGTACGCTGTAAC
646 Pra_021677_O_3 AAGCATGTTTTGAATTTATGGTGGTGGCATGTGGATATTTGAACTTGGTTGAGAAAAATT
647 Pra_027562_O_2 CATTCCTATTGAAGGGTCAACCTTTAATTTTGGCTAGCAGGACTGTATAGGATTATATGC
648 Pra_001504_O_2 TTATTGTATTTTAGATTCTTGATGGCCATCTAAACTTCTGGCTGCTTGGTGCAACATTGA
649 Pra_015211_O_2 ATAGCTAATGATTCCATGCTATCCATGGTATCTACTTCACGATAATAAAGGTCTTAGTCC
650 Pra_020421_O_2 CACCTAATAGGCCTGAGTATTGCTCACCACTATGCTGATATGGGGAGCAATAACGTTAGT
651 Pra_003187_O_2 TTTCTTTTCACTTTGTACTAATGATCATTGTGACCACAAAATCTTTATACACAATACAGA
652 Pra_015661_O_1 CTTGTCACTATCCTCATATTGATATCACCTCGTGTATGTTGTGGGGTGGCAAAATTACTT
653 Pra_013874_O_1 TATTTTAACTCAGCGACTTACCAGCCTAGTAAGCAATGGGGAGCTTGCATGTATTAGTTT
654 Pra_014615_O_1 ATTCGTCCTGGTCCTTTAGGACATGTACTTATGTCCATGCAAGTGCTTCTTGCCTAAGCT
655 Pra_004578_O_2 TTCTAGGCGATATATATCGCCGTAACTTTGGATGTGTTAAGAATATAGGGGATCATTAGC
656 Pra_023387_O_3 AGTTGCAGAGTGTGTAGCAACTGATGAGCATAGTTGTTATGTTTCTCAACTCAGTTGCAC
657 Pra_006970_O_1 AAGAAACTCATACACTGGACAGGCCAACCTTCCAAATATGTGTTTAGAAAACCTTTGTCT
658 Pra_010322_O_1 AAGGGGTGCTATCCATATCTAGAATCTACCATGCTCAATGAGGTATCTTCATTAGTATAC
659 Pra_022721_O_1 ATCTAATGCTAGTTTATTGATTTCTATGATCCAAGACCTCGTCATAGATCAAGTGCCTAG
660 Pra_023407_O_1 TTGTTATTAAATACCATTCAATATGCTTATGATTCATGAATGCTTAAGAGATTCTGCTGC
661 Pra_001945_O_2 GCTTCTAAACTGTAGAAGCCTGTTATCTTTAGACTCGTGGTTATGTGAACTACTTTTACA
662 Pra_008233_O_1 GGCTGTGGGGATTCGAGCCTGATGGTTATGCACTGTGGCCAGCAAGATGTTGAAGTTTTA
663 Pra_008234_O_4 GCCTGATGGTTATGCACTGTAAGTGATCTGATTTGATTAACTATTTTATCAATTAATTTT
664 Pra_022054_O_2 ATGGTCATTATCCGAGATAGTGCGCTTTGTCATGGGAAAATGACTATTGAATGTGAGTTT
665 Pra_012137_O_2 TTTTCTGGTGCATCCTTAACACAGCTTGGTTACATGGTGAATTACAGTATTTGAAGGAGT
666 Pra_012582_O_2 AGATTTAATGCCACTTAGGTGATCGGTGACCCACTTGTACATATAGATGTTGGCGATGTT
667 Pra_015285_O_2 AAGAAATTCATCAATTCTTTGAAATTATTGTTCCCTTTTGATGCGGCCCCTTTCTGGAGG
668 Pra_017229_O_1 TAAAGTATATTTTAGCCGCTGTTGTTGTAAATTTATGTTTTTCATTGCTATCAACATTTA
669 Pra_020724_O_2 GGTTTTCCTATAAGATGTATGAATTCGCACTGTGGTGCAATTTTATGAATTAAACTCAAA
670 Pra_004555_O_1 TTTACTATTCCGTCTGGGCTTAGAGATGTACGTTAATTGGTCATTTAAGACGACTCAGTT
671 Pra_004556_O_5 TCAAATCTAGTCAATATCCGTGTTGAGCTAAACAAGCGCTGAAAGTTTGCTCGAATCAGC
672 Pra_005729_O_2 AGAAAGTTGTGTACTAATTTGTATTGTAACGTCCATTTATCCAACGAGTCCTCCATTCAT
673 Pra_007395_O_3 CAGTACTGTATTCGAAGATCCTGAAAATTTACTAAAACAAATGGAATATCAACAACCTAG
674 Pra_009503_O_1 TTGCTCTATATAATTTGTGCTCGTGTGTGTACTTGAAGATCCATCCTCACATAGTCCAAT
675 Pra_011283_O_1 GTGTGTATAGTTTTATAACACTCTATGGTATCACTACCACTATGGGCCTGTTTAGTCCAA
676 Pra_012322_O_3 GAAGCAGAATCAGCTTTGACCAGTATTTAGTGTCTTGTATACAATTCTTGTTTCAGTGAA
677 Pra_023236_O_3 AAATCAAGATTAAAATCCGAAACCAAGGCTAACCAGCAAACTGTGAGGTGTACATTGTTG
678 Pra_000171_O_2 TTCCAAGCAGAAGGGCACATGTTGTGACATCAAGTAGTAGATTGTTCTGCAGATTCTGGT
679 Pra_000172_O_1 GTTAATGTAATACATTTAGTTTTTAGATAACTGTTAATGTGTAGTAAAGCACTAGGAAGA
680 Pra_001480_O_3 GAGGCTTCAAAGGTTTTTGTGTCTTTTCTAGTTATTATAAACGCTTCATAGGTTCCTAGG
681 Pra_001692_O_2 GAAGATTGTAAGTTGGGTGAACTTTTTTACCACGCTAGGTTGATCTATTTTAAGACTCTT
682 Pra_005313_ORF_O1 AAAATAGCTGCGCGTACCACAAAGGTGACAAACGCCGGATTTCTCTTATCAGACTTGTCA
683 Pra_006362_O_1 TTTAATTATCATAGTTTTATTCCGGCTATCTTGATCATTCACGGAAGTCCCGAGAGTCAA
684 Pra_006493_O_3 GTGGAGTGAACGTGGTTACTTCAATGGATTACCCTTCTATCGTGTCATTAAACACTTTGT
685 Pra_006983_O_1 GCTAACTCTTCTAGTTGAGATCTCCATCAATTAATGGATACAAACATTGAGTTTCACTTT
686 Pra_007665_O_1 GGATCACTACTGGATTCCGTTACATTAGTTATTGCAAGTTGGTTATTATGTACGTTTATA
687 Pra_012196_O_1 ATGAACAAATGCAATTACCCTGTTTTATTCTATCCCGCTTTAATTAATATTGGTCATGTT
688 Pra_013382_O_1 TTTGCTTGTGGATTGTACTGTGGTACATGGTATAAATCTATAGGCTATGTCGATTATTTT
689 Pra_016461_O_1 ATATAAGATATAAGATATTGCCAGCAAACTATTTGACAGGTTATTTAATAAAGTGTGCTA
690 Pra_017611_O_1 TTTTAAATGTGGACAGAGGCACTATAAGAATGCGAAATATCGTCGGAGCACGACTAATTG
691 Pra_019776_O_1 ATAGACTAGTTCTACAAAGCCCTAGGATGATGGACTTCATTTCTTTTGCATTAAGATGAA
692 Pra_020659_O_1 GATTTCTTATGGGGTTGGAACATTCCTCGCTGCCTTCTGGTAATATTAGGTTATGCGTTT
693 Pra_022559_O_3 AATTGAGGTTGACTGTGTACTTCTCCAGTGGACAGGAGAAAGCGATAAAATTCAAACGTT
694 Pra_024188_O_5 AAGGAAGGGCAAATAGAGCTCGCGCTCAAGAAATACCTTAAATCGATACGGTATTTGGAT
695 Pra_027973_O_2 TAATTTAAGAGCTATGAAACAACTACCTTTTGGAATGGTTTTGTTTTTAGCATCCCAATT
696 Pra_001353_O_1 TTGTAAATTATGCTGGTTCCATATGGGGGTTAATCAGTATCCTGGTTATTTGTGACACCA
697 Pra_001978_O_3 GTTGTGAACTATCAATAGACGGGGATGGTCCTTTTTAGCTGCTCCTTAAGCAGCTCAAAT
698 Pra_002810_O_2 TCAATTCCGGTCATATGTAGACGACTATAATGTTGTTTGTGTCCTATAACTATAGTGTTG
699 Pra_002811_O_1 CATTTTACACCCTATAACAAAATATAGTGTCATAAGTTTACACCAGGTAACAACTCTATA
700 Pra_002812_O_3 ATGGAGAGTTTTATTCATTACATGAAAGAGTATGTCACCTTTCGTGCTCCATCTATTGAT
701 Pra_003514_O_1 TTTCACGTCCTGTATACTCACTCAAGCAACTTTAGGATGAAGAGCTAAAGTATATCAAAG
702 Pra_004104_O_2 AATGCACTCTTTATAAAGTGGGATGAGGTATGTGTTTCCTTCCTATTGGCTAACCTGAAT
703 Pra_005595_O_1 ATTGGGCAATCGTTATTGATTTTACCTATCGCTATCTCACTGTCCGCCAATTTAGTGTAA
704 Pra_005754_O_1 TTTCAGCGGATATAAAGTCTTCCAACTTGTAAACCGGTGCTGTGAAGATTAAAAGTCCTT
705 Pra_006463_O_1 GCTTTAGAGGCAATGGTAGATTATGAAGTCAACACCAGGGAGTTTGACCGTTTGGGACAT
706 Pra_006665_O_1 CATTCAATTTGACATTGGAGTTTCAAGGCATTCCAAGGATAGCATGTACACAAGTTGAAT
707 Pra_006750_O_1 CATAAAATTACTATGGAAGTTGGATCATTATCTATGCCATAGTGGAGTAGAACTAGATTT
708 Pra_007030_O_1 CTCTTGATTCTAGAATCTAAACTACTACCTTGCGGACATGACTGAGCATCTCTCTAACAG
709 Pra_007854_O_1 CAGGGTTGTGCTAGTTTAACATTTTAACTTAATGTAATCATGTAAGCTTTAGAGAGGTGG
710 Pra_007917_O_1 GTAAATGTTTACATTGAGGTCATGCATGAGTGTTAATTACGCTTTCACTACTGTTCACTT
711 Pra_007989_ORF_O2 AATTAAAGCTTGGTTGTATGATCATTTGGGATCGAGAGTAGATTATGATGCTCCTGGGCA
712 Pra_008506_O_1 TTATCTAGCTAGAAGTTGTGAAATTAAGAGGGATGTGAGGATTGGGTTATAACTAGTGTA
713 Pra_008692_ORF_O2 AATGAATCAGGCATTAAAGCGGGAATCATTTATGACTTGGCAACCTGAAAATTCTATTAA
714 Pra_008693_O_2 TTCTTGACGTTTTAATATGGTATGGTATTAAATTTGGAAGGCCTATTCGATTGTTTGCAA
715 Pra_009170_O_1 TTCTTATAACCTGTACGATTGCCGATATATCACCAATTTTGCTGATTTTAATCTGAGTTT
716 Pra_009408_O_1 CAATTTCATATTCGGGTTCAATGTAGTGCCTCTCATTTTAGGGTGATAGCATGAGTTTTT
717 Pra_009522_O_1 TCCACAAGTTAACATAGGTAACTATCGACTGAAGTGAACTGGGGGGCAGAAGCTAACTAT
718 Pra_009734_O_2 TTTAGATAGCCATTTACATTTTACTTATTATTGGACTTGTAAAGATTTTTGTACCCTTGT
719 Pra_009815_O_4 TTGCTGAAATATTTCAAGCTGAAAGTTATGATTCTGGCCAAGAAGTCTACTGAAAATTTG
720 Pra_010670_O_2 AAACATAAGTTTGGCCCAGATTCGGTTTATCATAAAATCTGGCTGCATATAAGGTGTCAG
721 Pra_011297_O_1 ATGTTCTAGAATTTGTCTAAGCTAGCTACTGGTGTTTAACTGATATGGAAAACTTTTGCC
722 Pra_013098_O_2 TTTGGGGAGTACTTTAGTCAATAAAAGTGAAGTGAATCATGATATAAAGGGTTTAAGTAA
723 Pra_013172_O_2 AGAAGTTACTAATTTGTAGATAAATTCTAACGAAGGTGATGATAGCATACACGTAATGAA
724 Pra_013589_O_2 GAATTTTGATGGTAGCGTATGGTTGAAGGAAAACTTGGATATATCATGTAAACATTTTTC
725 Pra_013608_O_1 TTAATGAACCGCTTTTTCCTTGAGAGGCTATGAATGCCTGTAGAACTAATCCTTTAAGTA
726 Pra_014299_O_2 TTTCTCTAACACTATATTTTCTGGTATGACCGCTCTACATTGTATATTAACCCTTGCAAA
727 Pra_014498_O_1 TATATTCACTGTGCTGGGATTATCCTCTCCCCTTTTTGACCCACTGTTGTGTGTATTTGA
728 Pra_014548_O_1 GAGCATACAGCGTTATCTTTGAGACGAGTCATCAATGATAATATCCTCGTAAAAGGTTAC
729 Pra_014610_O_2 TTTATTCAATTACGACGGATTCAGTTGGCCTTTTGTAACATTCAAGTATCCATCTATCAC
730 Pra_016090_O_2 ATGTTCAGGGGTATTAAAAATTCAGAGGATAAATTTCCTCACTCTCAAGTGTTAGATGGT
731 Pra_016722_O_2 CAAAGTCTAGACGTTAATGTTTTGGAACTCTTTTTTCGAATTTGTGCCTATTGAATCACT
732 Pra_016785_O_3 TATAAATATATTGTACTGGGGATCCAAGACATGGCAATATATGTCGAGATTTTCATTTTC
733 Pra_017094_O_3 CTTTTGCATGAGTTCAAATGTCTTTGTGACATATTGTCTTGAACCACCGAGGATATATCA
734 Pra_017527_O_2 GTTTGTATGTCCAATAGATTATAACCTATTTACTGTGACACTATTCTTCACACCCATGTC
735 Pra_017591_ORF_O2 AGATCTAGTTGTTTCAGCATCGTTGGACCAAACTGTTCGTGTATGGGATATAAGTGGCCT
736 Pra_017769_O_2 TGCCGTATCAAAAGATTGGTACTTCCTTATGGACACACAAGATCGTAAGCATGGCTGAAT
737 Pra_018047_O_2 TTGATGGCCACATGAGTTGTTTATACAAGTCGTTGTTTTATGAGAGAACCTTCTTCAGAT
738 Pra_018414_O_1 ATTTCTATAGTGCCATATGCTTGTCGGTTGTCATTGACCTCTAATAGAATAGCCAGAGTA
739 Pra_018986_O_1 TTCACGGCAGTTGAACTAGTCATAGTGGAATATTATTTAAATGGTGTATTCTAGTCACAT
740 Pra_019479_ORF_O1 TGCAGGCGCTCTATAGTTCTGTTCTCTAGCATGAAGTGTGTATTTTATCTATTGTGGACC
741 Pra_020144_O_1 TGTCTTTAATCTTCAGGGTTCGTTACTAACAATTGAGCTCAAATCTCTATTCTGACCAGC
742 Pra_022480_O_1 CATTTATAGAGTTGTGCAAAATCACCCATAATGCTATGAATTGACAGGTGACTGTAATCT
743 Pra_023079_O_2 GGAGAAAATTTCCTATCCCTTTGTGGGTGTGTGAAAAACGAAATATAGAGGAACAATGTG
744 Pra_026739_O_2 ACCAATCATTTATTTGCAGTGTAGTTGATATGAAGGGAGAAATATGACAGTTGGTTTCAA
745 Pra_026951_O_2 AAGTTAATGTTCTCATAGGTTATTCATTGGAGTTGTCTCGTATGTACGCTGTGCCGTAGT
746 Pra_026529_O_2 CTCATAAATTGAGGCTTGCCTACGTTAATTGTTATATATGGAGAGCCATGCTAATTGTTA
747 Euc_006366_O_2 GCAGATCATGTAATTGTATCTCAAATTATAGTATCCGTATTCTGTACAAATGCTCCGGAA
748 Euc_017378_O_1 TCTTTACGCAGATGGTGACTGAAGCTGGTTCCGAGATCGGCATATGTAGCTGGTAGAGGT
749 Pra_000888_O_1 TTCACATTGAGGGTTGCCGTCGGTATTCGCCGATGATATCCTGTTTTACGCGCAACAGTT
750 Pra_014166_O_1 TCATTATTTAGGGTGCAGGCTGTATAAAATGTTGTAAATTGTAGTATCAATGTGTACAAT
751 Pra_003189_O_1 GCATTCACCACGACAGTAAAGTAATCATTATGATTACTAATGTATTGCTTTCATGGGGTG
752 Pra_009356_O_4 AAAGGGTATATTTTGTCTCATGTTGGGGTGATAATTCTCCCTGAAAGTCTCCAAAATATA
753 Pra_000065_ORF_O_2 AAATTTCCGGTTGCCATAGTCTAGTGGGGTGAGGGTTCATTCTAGGGGATTTATTGTGTT
754 Pra_014197_ORF_O1 GCAGTGATAAAGGTACTTCTTGGTGATAATCCTAAAGCCTTACCCATGGATATCCAGCCT
755 Pra_009081_O_2 TTCTTTAACAAGGTAAAAATCCCCCCCTTGGCATGTAGCTCAATTAGTTGTAATGGAACT
756 Pra_013417_O_1 AGTTGTAAACAGTGTAATAAGGAGCAGAAGTTGTGATAGCTTTTAGGAACGATAGACTTT
757 Pra_005755_O_1 TGAACCAATTCTTGTATATTAGATATGTAACATGTATGAATGTCCATAGAGCAGAGCTTT
758 Pra_006670_O_2 AGCCAGGCACGCTTAACTAAATTTCGTTTAGTTCACCATGACTATTCGTTGAACTTAATG
759 Pra_007027_O_1 CAAAACCCCTTGTAGGGTGGACTTCTGTTGTATCCAATTTTTATGGCATAATTAGCTAGT
760 Pra_007276_O_1 AATTTGGTGATTATTCCTTACCATATCGTACTGTACAGATACGGTAAGGTCGAAATATAT
761 Pra_007390_ORF_O1 CATGCCGTGATCGGTCGATTGCATTAAGTGCTGCAAGGATCAAATAGTGGCACTGTCATG
762 Pra_012648_ORF_O1 CAAACATAAATAAGGTTGCTACTTTAAAGGGACATACGGAACGAGTTACTGATGTGGCAT
763 Pra_013171_O_2 ATTTATGGATGAGGTACTCCTTATGAATATCTTCAAACTAAGAAATAACTATATATGCAA
764 Euc_045414_O_2 CTTGGTTTTTGTTGAGCTTTCTATTTCAAGCAATTTGTGATTGGGGGGTTCTGCATTCTT
765 Euc_044328_O_2 ATGTCTAAAGAGCCGTGATCTATGAGTAGATTAGAAACCGCCTTTTTAGTTGCAAACGCC
766 Euc_015615_O_2 TTGCAACAAGGTATACTTAGTCAGTCCTTGTTATGTATGTCTTTTGTCAACCCTTCAGGG
767 Euc_017239_O_3 GGCGGAATCCCTTTGTTCTTTCGAGCTTTACGTGACAAGTCGGCCAGAAAGCAGTAGCAT
768 Euc_018643_O_3 TTGATGTACGAGCCGCTATATCTAATTCTGCCTCCCAGTCACTGCCAAGTTTTACTCTTC
769 Euc_019127_O_5 GTCTTGCATGTCAGCTATTATACAGTCCTGTTTATAGTCCTGTGATGTAATAAAAAGCTG
770 Euc_022624_O_3 AAGTAGGAGATCGTGTAGAGAGAATACTTTCTGCTCTCAGCGGCGAAGAGGTTTGTCTGC
771 Euc_032424_O_1 AATTGTGAGTAGAATAGGAGAAACTTTTGTACAAGATTAATACGTGTGGCATAATAAGAT
772 Euc_037472_O_1 TGATGTGCAGTTTACATTATTATGGTTCGAGTATTATTTAGCTGCCCTATCTTAAGTCAT

TABLE 15
Peptide Table.
Protein SEQ ID   Target   Patent PEPTIDE Sequence   Patent ORF start   Patent ORF stop
261 CDK type A MGDGSLGSGGRGNSGGGGGGGSRPEWLQQYDLIGKIGEGTYGLVFLARIKHPST 387 1820
NRGKYIAIKKFKQSKDGDGVSPTAIREIMLLREISHENVVKLVNVHINPVDMSL
YLAFDYADHDLYEIIRHHRDKVNQAINPYTVKSLLWQLLNGLNYLHSNWIIHRD
LKPSNILVMGEGEEQGVVKIADFGLARVYQAPLKPLSDNGVVVTIWYRAPELLL
GAKHYTSAVDMWAVGCIFAELLTLKPLFQGQEVKANPNPFQLDQLDKIFKVLGH
PTQEKWPMLVNLPHWQSDVQHIQRHKYDDNALGNVVRLSSKNATFDLLSKMLEY
DPQKRITAAQALEHEYFRMEPLPGRNALVPSSPGDKVNYPTRPVDTTTDIEGTT
SLQPSQSASSGNAVPGNMPGPHVVTNRPMPRPMHMVGMQRVPASGMAGYNLNPS
GMGGGMNPSGIPMQRGVANQAQQSRRKDPGMGMGGYPPQQKQRRF
262 CDK type A MEKYQQLAKIGEGTYGIVYKAKDKKSGELLALKKIRLEAEDEGIPSTAIREISL 99 1007
LKQLQHPNIVRLYDVVHTEKKLTLVFEFLDQDLKKYLDACGDNGLEPYTVKSFL
YQLLQGIAFCHEHRVLHRDLKPQNLLINMEGELKLADFGLARAFGIPVRNYTHE
VVTLWYRAPDVLMGSRKYSTQVDIWSVGCIFAEMVNGRPLFPGSSEQDQLLRIF
KTLGTPSLKTWPGMAELPDFKDNFPKYVVQSFKKICPKKLDKTGLDLLSRMLQY
DPAKRISAEQAMGHPYFKDLKLRKPKAAGPGP
263 CDK type A MDQYEKIEKIGEGTYGVVYKAIDRSTNKTIALKKIRLEQEDEGVPSTAIREISL 120 1004
LKEMQHGNIVKLQDVVHSERRLYLVFEYLDLDLKKHMDSCPEFSKDTHTIKMFL
YQILRGISYCHSHRVLHRDLKPQNLLLDRRTNSLKLADFGLARAFGIPVRTFTH
EVVTLWYRAPEILLGSRHYSTPVDVWSVGCIFAEMVNRRPLFPGDSEIDELFKI
FRIMGTPNEDSWPGVTSLPDFKSTFPKWASQDLKTVTPTVDPAGIDLLSKMLCM
DPRRRITAKVALEHEYFKDVGVIP
264 CDK type A MVMKSKLDKYEKLEKLGEGTYGVVYKAQDKTTKEIYALKKIRLESEDEGIPSTA 23 937
IREIALLKELQHPNVVRIHDVIHTNKKLILVFEFVDYDLKKFLHNFDKGIDPKI
VKSLLYQLVRGVAHCHQQKVLHRDLKPQNLLVSQEGILKLGDFGLARAFGIPVK
NYTNEVVTLWYRAPDILLGSKNYSTSVDIWSIGCIFVEMLNQKPLFPGSSEQDQ
LKKIFKIMGTPDATKWPGIAELPDWKPENFEKYPGEPLNKVCPKMDPDGLDLLD
KMLKCNPSERIAAKNAMSHPYFKDIPDNLKKLYN
265 CDK type A MDQYEKVEKIGEGTYGVVYKAIDRLTNETIALKKIRLEQEDEGVPSTAIREISL 149 1033
LKEMQHGNIVRLQDVVHSENRLYLVFEYLDLDLKKHMDSSPDFAKDPRLVKIFL
YQILRGIAYCHSHRVLHRDLKPQNLLIDRRTNALKLADFGLARAFGIPVRTFTH
EVVTLWYRAPEILLGSRHYSTPVDVWSVGCIFAEMVNQRPLFPGDSEIDELFKI
FRILGTPNEDTWPGVTALPDFKSAFPKWPAKNLQDMVPGLNSAGIDLLSKMLCL
DPSKRITARSALEHEYFKDIGFVP
266 CDK type B-1 MEKYEKLEKVGEGTYGKVYKAKDKATGQLVALKKTRLEMDEEGVPPTALREVSL 199 1116
LQLLSQSLYVVRLLSVEHVDGGSKRKPMLYLVFEYLDTDLKKFIDSHRKGPNPR
PVPAATVQNFLYQLLKGVAHCHSHGVLHRDLKPQNLLVDKEKGILKIADLGLGR
AFTVPLKSYTHEVVTLWYRAPEVLLGSAHYSIGVDMWSVGCIFAEMVRRQALFP
GDSEFQQLLHIFRLLGTPTEKQWPGVTTLRDWHVYPQWEPQNLARAVPSLGPDG
VDLLSKMLKYDPAERISAKAALDHPFFDSLDKSQF
267 CDK type B-2 MERPATAAVSAMEAFEKLEKVGEGTYGKVYRAREKATGKIVALKKTRLHEDEEG 41 982
VPPTTLREISILRMLSRDPHIVRLMDVKQGQNKEGKTVLYLVFEYMETDLKKYI
RGFRSSGESIPVNIVKSLMYQLCKGVAFCHGHGVLHRDLKPHNLLMDKKTLTLK
IADLGLARAFTVPIKKYTHEILTLWYRAPEVLLGATHYSTAVDMWSVGCIFAEL
VTKQALFPGDSELQQLLHIFRLLGTPNEKMWPGVSSLMNWHEYPQWKPQSLSTA
VPNLDKDGLDLLSQMLHYEPSRRISAKAAMEHPYFDDVNKTCL
268 CDK type C MGCVLGREVSSGIVTESKGRDSSEVETSKRDDSVAAKVEGEGKAEEVRTEETQK 291 2042
KEKVEDDQQSREQRRRSKPSTKLGNLPKHIRGEQVAAGWPSWLSDICGEALNGW
IPRRANTFEKIDKIGQGTYSNVYKAKDLLTGKIVALKKVRFDNLEPESVRFMAR
EILILRHLDHPNVVKLEGLVTSRMSCSLYLVFEYMEHDLAGLAASPAIKFTEPQ
VKCYMHQLLSGLEHCHNRRVLHRDIKGSNLLIDNGGVLKIGDFGLASFYDPDHK
HRMTSRVVTLWYRPPELLLGANDYGVGIDLWSAGCILAELLAGKPIMPGRTEVE
QLHKIYKLCGSPSEEYWKKYKLPNATLFKPREPYRRCIRETFKDFPPSSLPLIE
TLLAIDPAERGTATDALQSEFFRTEPYACEPSSLPQYPPSKEMDAKKRDDEARR
LRAASKGQADGSKKERTRDRRVRAVPAPEANAELQHNIDRRRLISHANAKSKSE
KFPPPHQDGALGFPLGASHRFDPAVVPPDVPFTSTSFTSSKEHDQTWSGPLVDP
PGAPRRKKHSAGGQRESSKLSMGTNKGRRADSHLKAYESKSIA
269 CDK type C MYSKSSAVDDSRESPKDRVSSSRRLSEVKTSRLDSSRRENGFRARDKVGDVSVM 107 2236
LIDKKVNGSARFCDDQIEKKSDRLQKQRRERAEAAAAADHPGAGRVPKAVEGEQ
VAAGWPVWLSAVAGEAIKGWLPRRADTFEKLDKIGQGTYSSVYKARDVTNNKIV
ALKRVRFDNLDTESVKFMAREIHILRMLDHPNVIKLEGLITSRMSCSLYLVFEY
MEHDLTGLASRPDVKFSEPQIKCYMKQLLSGLDHCHKHGVLHRDIKGSNLLIDN
NGILKIADFGLASVFDPHQTAPLTSRVVTLWYRPPELLLGASRYGVEVDLWSTG
CILGELYTGKPILPGKTEVEQLHKIFKLCGSPSDDYWRRLHLPHAAVFKPPQPY
RRCVAEIFKELPPVALGLLETLISVDPSQRGTAAFALRSEFFTASPLPCDPSSL
PKYPPSKEIDMKLREEEARRRGAAGGKNELEKRGTKDSRTNSAYYPNAGQLQVK
QCHSNANGRSEIFGPYQEKTVSGFLVAPPKQARVSKETRKDYAEQPDRASFSGP
LVPGPGFSKAGKELGHSITVSRNTNLSTLSSLVTSRTGDNKQKSGPLVSESANQ
ASRYSGPIREMEPARKQDRRSHVRTNIDYRSREDGNSSTKEPALYGRGSAGNKI
YVSGPLLVSSNNVDQMLKEHDRRIQEHARRARFDKARVGNNHPQAAVDSKLVSV
HDAG
270 CDK type C MGCIPTIISDGRRRSAAPDKRRPRPRRSSSEGEAPPHATAAGSEGGESARGAPG 82 1749
KERPEPAPRFVVRSPQGWPPWLVAAVGHAIGEFVPRCADSFRKLAKIGEGTYSN
VYKARDLVTGKTVALKKVRFDNLEAESIKFMAREILVLTRLNHPNVIKLEGPVT
SRMSSGLYLAFEYMEHDLSGIAARQNGKFTEPQVKCFMRQLLSGLEHCHNHDVL
HRDIKCSNLLIDNEGNLKIADFGLATFYDPERKQVMTNRVVTLWYRAPELLLGA
TSYGIGIDLWSAGCILAELLYGKPIMPGRTEVEQLHKIFKLCGSPSEAYWNKFK
LPNANIFKPPQPYARCIAETFKDFPPSALPLLETLLSIDPDERGTATTALNSEF
FAAEPHACEPSSLPKYPPSKEMDLKLIKEKTRRDSSKRPSAIHGSRRDGIHDRA
GRVIPAPEATAENQATLHRPRAMKKANPMSRSEKFPPAHMDGVVGSSANAWLSG
PASNAAPDSRRHRSLNQNPSSSVGKASTGSSTTQETLKVAPELLQVGSSSLHPC
HRMLVYGSNLTIRSK
271 CDK type C MGCICAKQADRGPASPGSGILTGAGTGTGTRSSKIPSGLFEFEKSGVKEHGGRS 151 1560
GELRKLEEKGSLSKRLRLELGFSHRYVEAEQAAAGWPSWLTAVAGDAIQGLVPL
KADSFEKLEKIGQGTYSSVFRARELANGRMVALKKVRFDNFQPESIQFMAREIS
ILRRLDHPNIMKLEGIITSRMSNSIYLVFEYMEHDLYGLISSPQVKFSDAQVKC
YMKQLLSGIEHCHQHGVIHRDVKSSNILVNNEGILRIGDFGLANILNPKDRQQL
TSHVVTLWYRPPELLMGSTSYGVTVDLWSVGCVFAELMFRKPILRGRTEVEQLH
KIFKLCGSPPDGYWKMCKVPQATMFRPRHAYECTLRERCKGIATSAMKLMETFL
SIEPHKRGTASSALISEYFRTVPYACDPSSLPKYPPNKEIDAKHREEARRKKAR
SRVREAEVGKRPTRIHRASQEQGFSSNIAPKEKRSYA
272 CDK type C MAVAAPGHLNVNESPSWGSRSVDCFEKLEQIGEGTYGQVYMAKEKKTGEIVALK 82 1644
KIRMDNEREGFPITAIREIKILKKLHHENVIKLKEIVTSPGPEKDEQGRPEGNK
YKGGIYMVFEYMDHDLTGLADRPGMRFSVPQIKCYMRQLLTGLHYCHINQVLHR
DIKGSNLLIDNEGNLKLADFGLARSFSNDHNANLTNRVITLWYRPPELLLGATK
YGPAVDMWSVGCIFAELLHGKPIFPGKDEPEQLNKIFELCGAPDEINWPGVSKI
PWYNNFKPTRPMKRRLREVFRHFDRHALELLERMLTLDPSQRISAKDALDAEYF
WADPLPCDPKSLPKYESSHEFQTKKKRQQQRQHEETAKRQKLQHPPQHPRLPPV
QQSGQAHAQMRPGPNQLMHGSQPPVATGPPGHHYGKPRGPSGGAGRYPSSGNPG
GGYNHPSRGGQGGSGGYNSGPYPPQGRAPPYGSSGMPGAGPRGGGGNNYGVGPS
NYPQGGGGPYGGSGAGRGSNMMGGNRNQQYGWQQ
273 CDK type C MGCICTKGILPAHYRIKDGGLKLSKSSKRSVGSLRRDELAVSANGGGNDAADRL 626 2782
ISSPHEVENEVEDRKNVDFNEKLSKSLQRRATMDVASGGHTQAQLKVGKVGGFP
LGERGAQVVAGWPSWLTAVAGEAINGWVPRRADSFEKLEKIGQGTYSSVYRARD
LETNTIVALKKVRFANMDPESVRFMAREIIIMRKLDHPNVMKLEGLITSRVSGS
LYLVFEYMDHDLAGLAATPSIKLTESQIKCYMQQLLRGLEYCHSHGVLHRDIKG
SNLLVDNNGNLKIGDFGLATFFRTNQKQPLTSRVVTLWYRPPELLLGSSDYGAS
VDLWSSGCILAELFAGKPIMPGRTEVEQLHKIFKLCGSPSEEYWKKSKLPHATI
FKPQQPYKRCLLETFKDFPSSALGLLDVLLAVEPECRGTASSALQNEFFTSNPL
PSDPSSLPKYPSSKEFDARLRDEEARKHKATAGKARGLESIRKGSKESKVVPTS
NANADLKASIQKRQEQSNPRSTGEKPGGTTQNNFILSGQSAKPSLNGSTQIGNA
NEVEALIVPDRELDSPRGGAELRRQRSFMQRRASQLSRFSNSVAVGGDSHLDCS
REKGANTQWRDEGFVARCSHPDGGELAGKHDWSHHLLHRPISLFKKGGEHSRRD
SIASYSPKKGRIHYSGPLLPSGDNLDEMLKEHERQIQNAVRKARLDKVKTKREY
ADHGQTESLLCWANGR
274 CDK type D MDPDPSPDPDPPKSWSIHTRREIIARYEILERVGSGAYSDVYRGRRLSDGLAVA 13 1467
LKEVHDYQSAFREIEALQILRGSPHVVLLHEYFWREDEDAVLVLEFLRSDLAAV
IADASRRPRDGGGGGAAALRAGEVKRWMLQVLEGVDACHRNSIVHRDLKPGNLL
ISEEGVLKIADFGQARILLDDGNVAPDYEPESFEERSSEQADILQQPETMEADT
TCPEGQEQGAITREAYLREVDEFKAKNPRHEIDKETSIFDGDTSCLATCTTSDI
GEDPFKGSYVYGAEEAGEDAQGCLTSCVGTRWFRAPELLYGSTDYGLEVDLWSL
GCIFAELLTLEPLFPGISDIDQLSRIFNVLGNLSEEVWPGCTKLPDYRTISFCK
IENPIGLESCLPNCSSDEVSLVRRLLCYDPAARATPMELLQDKYFTEEPLPVPI
SALQVPQSKNSHDEDSAGGWYDYNDMDSDSDFEDFGPLKFTPTSTGFSIQFP
275 CDK type D MDPDPSPSPDPPKSWSIHTRREIIARYEILERVGSGAYSDVYRGRRLSDGLAVA 113 1558
LKEVHDYQSAFREIEALQILRGSPHVVLLHEYFWREDEDAVLVLEFLRSDLAAV
IADASRRPRGGGVAPLRAGEGKRWMLQVLEGVDACHRNSIVHRDLKPGNLLISE
EGVLKIADFGQARILLDDGNVAPDYEPESFEERSSEQADILQQPETMEADTTCP
EGQEQGAITREAYLREVDEFKAKNPRHEIDKETSIYDGDTSCLATCTTSDIGED
PFKGSYVYGAEEAGEDAQGSLTSCVGTRWFRAPELLYGSTDYGLEVDLWSLGCI
FAELLTLEPLFPGISDIDQLSRIFNVLGNLSEEVWPGCTKLPDYRTISFCKIEN
PIGLESCLPNCSSDEVSLVRRLLCYDPAARATPMELLQDKYFTEEPLPVPISAL
QVPQSKNSHDEDSAGGWYDYNDMDSDSDFEDFGPLKFTPTSTGFSIQFP
276 Cyclin A MSNQHRRSSFSSSTTSSLAKRHASSSSSSLENAGKAFAAAAVPSHLAKKRAPLG 187 1686
NLTNLKAGDGNSRSSSAPSTLVANATKLAKTRKGSSTSSSIMGLSGSALPRYAS
TKPSGVLPSVNPSIPRIEIAVDPMSCSMVVSPSRSDMQSVSLDESMSTCESFKS
PDVEYIDNEDVSAVDSIDRKTFSNLYISDAAAKTAVNICERDVLMEMETDEKIV
NVDDNYSDPQLCATIACDIYQHLRASEAKKRPSTDFMDRVQKDITASMRAILID
WLVEVAEEYRLVPDTLYLTVNYIDRYLSGNVMNRQRLQLLGVACMMIAAKYEEI
CAPQVEEFCYITDNTYFKEEVLQMESSVLNYLKFEMTAPTVKCFLRRFVRAAQG
VNEVPSLQLECMANYIAELSLLEYDMLCYAPSLVAASAIFLAKFVITPSKRPWD
PTLQHYTLYQPSDLGNCVKDLHRLCFNNHGSTLPAIREKYSQHKYKYVAKKYCP
PSIPPEFFHNLVY
277 Cyclin A MNKENAVGTKSEAPTIRITRSRSKALGTSTGMLPSSRPSFKQEQKRTVRANAKR 238 1653
SASDENKGTMVGNASKQHKKRTVLNDVTNIFCENSYSNCLNAAKAQTSRQGRKW
SMKKDRDVHQSGAVQIMQEDVQAQFVEESSKIKVAESMEITIPDKWAKRENSEH
SISMKDTVAESSRKPQEFICGEKSAALVQPSIVDIDSKLEDPQACTPYALDIYN
YKRSTELERRPSTIYMETLQKDVTPNMRGILVDWLVEVSEEYKLVPDTLYLTVN
LIDRSLSQKFIEKQRLQLLGVTCMLIASKYEEICPPRVEEFCFITDNTYTSLEV
LKMESRVLNLLHFQLSVPTVKTFLRRFVQAAQVSSEVPSVELEYLANYLAELTL
VEYSFLKELPSLMAASAVLLARWTLNQSDNPWNLTLEHYTKYKASELKAAVLAL
EDLQLNTSGSTLNAIREKYRQQKVNYSLLIHSKANHEIL
278 Cyclin B MAGSDENNPGVVGGAHVQEGLRVGAGKMGAGNVQQRRALSNINSNIIGAPPYPC 235 1539
AVNKRVLSEKNVNSENDLLNAAHRPITRQFAAQMAYKQQLRPEENKRTTQSVSN
PSKSEDCAILDVDDDKMADDFPVPMFVQHTEAMLEEIDRMEEVEMEDVAEEPVT
DIDSGDKENQLAVVEYIDDLYMFYQKAEASSCVPPNYMDRQQDINERMRGILID
WLIEVHYKFELMDETLYLTVNLIDRFLAVQPVVKKKLQLVGVTAMLLACKYEEV
SVPVVEDLILISDRAYSRKEVLEMERLMVNTLHFNMSVPTPYVFMRRFLKAAQS
DKKLELLSFFIIELSLVEYDMLKFPPSLLAASAIYTALSTITRTKQWSTTCEWH
TSYSEEQLLECARLMVTFHQRAGSGKLTGVHRKYSTSKFGHAARTEPANFLLDF
RL
279 Cyclin B MASRPIVPVQARGEAAIGGGAGKAAIGGGAGKQQKKNGAAEGRNRKALGDIGNL 158 1618
VTVRGIEGKVQPHRPITRSFCAQLLANAQAAAAAENNKKQAVVNVNGAPSILDV
PGAGKRAEPAAAAAAAVAKAAQKKVVKPKQKAEVIDLTSDSERAIEAKKKQQHH
EPTKKEGEKSSRRNMPTLTSVLTARSKAACGMTKKPKEKVVDIDAGDAHNELAA
FEYIEDIYTYYKEAENESLPRNYMSSQPEINEKMRAILVDWLIEIHNKFDLMPE
TLYLTINIIDRFLSVKAVPRRELQLLGMGALFTASKYEEIWAPEVNDLVCIADR
AYSHEQVLAMEKTILGKLEWTLTVPTHYVFLVRFIKASLGDRKLENMVYFLAEL
GVMNYATLTYCPSMVAASAVYAARCTLGLTPLWNDTLKLHTGFSESQLMDCARL
LVGYHAKAKENKLQVVYKKYSSSQREGVALIPPAKALLCEGGGLSSSSSLASSS
280 Cyclin B MGLPDENNAALSKPTNLQVGGLEIGGRKFGQEIRQTRRALSVINQNLVGDRAYP 205 1530
CHVVNKRGHSKRDAVCGKDQVDPVHRPLTRKFAAQTASTQQHCIEEAKKPRTAV
QERNEFGDCIFVDVEDCQPSSENQPVPMFLEIPESRLDDDMEEVEMEDIVEEEE
EEPIMDIDGRDKKNPLAVVDYIEDIYANYRRTENCSCVSANYMAQQADINEKMR
SILIDWLIEVHDKFDLMHETLFLTVNLIDRFLARQSVVRKKLQLVGLVAMLLAC
KYEEVSVPVVGDLILISDKAYTRKEVLEMESLMLNSLQFNMSVPTPYVFMRRFL
KAAESDKKLEVLSFFLIELSLVEYEMVKFPPSLLAAAAIFTAQCTLYGFKQWTK
TCEWHSNYTEDQLLECARMMVGFHQKAATGKLTGVHRKYGTSKFGYTSKCEPAN
FLLGEMKNP
281 Cyclin B MGLPDENNAALSKPTNLQVGGLEIGGRKFGQEIRQTRRALSVINQNLVGDRAYP 174 1499
CHVVNKRGHSKRDAVCGKDQVDPVHRPLTRKFAAQTASTQQHCIEEAKKPRTAV
QERNEFGDCIFVDVEDCQPSSENQPVPMFLEIPESRLDDDMEEVEMEDIVEEEE
EEPIMDIDGRDKKNPLAVVDYIEDIYANYRRTENCSCVSANYMAQQADINEKMR
SILIDWLIEVHDKFDLMHETLFLTVNLIDRFLARQSVVRKKLQLVGLVAMLLAC
KYEEVSVPVVGDLILISDKAYTRKEVLEMEKLMLNSLQFNMSVPTPYVFMRRFL
KAAESDKKLEVLSFFLIELSLVEYEMVKFPPSLLAAAAIFTAQCTLYGFKQWTK
TCEWHSNYTEDQLLECARMMVGFHQKAATGKLTGVHRKYGTSKFGYTSKCEAAN
FLLGEMKNP
282 Cyclin D MAMVQRQGHDPSSPQEQEDGPSSFLSDDALYCEEGRFEEDDGGGGGQVDGIPLF 94 1332
PSQPADRQQDSPWADEDGEEKEEEEAELQSLFSKERGARPELAKDDGGAVAARR
EAVEWMLMVRGVYGFSALTAVLAVDYLDRFLAGFRLQRDNRPWMTQLVAVACLA
LAAKVEETDVPLLVELQEVGDARYVFEAKTVQRMELLVLSTLGWEMHPVTPLSF
VHHVARRLGASPHHGEFTHWAFLRRCERLLVAAVSDARSLKHLPSVLAAAAMLR
VIEEVEPFRSSEYKAQLLSALHMSQEMVEDCCRFILGIAETAGDAVTSSLDSFL
KRKRRCGHLSPRSPSGVIDASFSCDDESNDSWATDPPSDPDDNDDLNPLPKKSR
SSSPSSSPSSVPDKVLDLPFMNRIFEGIVNGSPI
283 Cyclin D MEASYQPHHHGHLRQHDPSSSQQEEQVPFDALYCSEEHWGEEDEEEGLASDGLL 176 1342
SEERDHRLLSPRALLDQDLLWEDEELASLFSKEEPGGMRLNLENDPSLADARRE
AVEWIMRVHAHYAFSALTALLAVNYWDRFTCSFALQEDKPWMTQLSAVACLSLA
AKVEETQVPLLIDFQVEDSSPVFEAKNIQRMELLVLSSLEWKMNPVTPLSFLDY
MTRRLGLTGHLCWEFLRRCENVLLSVISDCRFTCYLPSVIAASTMLHVINGLKP
RLDVEDQTQLLGILAMGMDKIDACYKLIDDDHALRSQRYSHNKRKFGSVPGSPR
GVMELCFSSDGSNDSWSVAASVSSSPEPHSKKSRAGEEAEDRLLRGLEGEEDDP
ASADIFSFPH
284 Cyclin D MALQEEDTRRHYPTAPPFSPDGLYCEDETFGEDLADNACEYAGGGARDGLCEIK 150 1283
DPTLPPSLLGQDLFWEDGELASLVSRETGTHPCWDELISDGSVALARKDAVGWI
LRVHGHYGFRPLTAMLAVNYLDRFFLSRSYQRDRPWISQLVAVACLSVAAKVEE
TQVPILLDLQVANAKFVFESRTIQRMELLLMSTLDWRMNSVTPISFFDHILRRF
GLTTNLHRQFFWMCERLLLSVVADVRLASFLPSVVATAAMLYVNKEIEPCICSE
FLDQLLSLLKINEDRVNECYELILELSIDHPEILNYKHKRKRGSVPSSPSGVID
TSFSCDSSNDSWGVASSVSSSLEPRFKRSRFQDQQMGLPSVNVSSMGVLNSSY
285 Cyclin- MGQIQYSEKYFDDTYEYRHVVLPPDVAKLLPKNRLLSENEWRAIGVQQSRGWVH 101 367
dependent YAIHRPEPHIMLFRRPLNYQQQQENQAQQNMLAK
kinase
regulatory
subunit
286 Histone MGSIDPPKAEQNGTAAAAVADPGQKPGAGDAMPPPPPVKHSNGTAAEPDVATKR 9 1352
acetyltransferase RRMSVLPLEVGTRVMCRWRDGKYHPVKVIERRKLNPGDPNDYEYYVHYTEFNRR
LDEWVKLEQLDLNSVETVVDEKVEDKVTGLKMTRHQKRKIDETHVEGHEELDAA
SLREHEEFTKVKNIATIELGRYEIETWYFSPFPPEYNDCSKLYFCEFCLNFMKR
KEQLQRHMKKCDLKHPPGDEIYRSGTLSMFEVDGKKNKVYGQNLCYLAKLFLDH
KTLYYDVDLFLFYVLCECDDRGCHMVGYFSKEKHSEESYNLACILTLPPYQRKG
YGKFLIAFSYELSKKEGKVGTPERPLSDLGLLSYKGYWTRVLLDILKKHKANIS
IKELSDMTAIKADDILNTLQSLDLIQYRKGQHVICADPKVLDRHLKAAGRGGLE
VDVSKLIWTPYREQG
287 Histone MAQKHSTAPDPAAEPKKRRRVGFSGIDAGVDPNGCFKVYLVSREEEVGAPDSFC 89 1486
acetyltransferase LDPVDLSHFFEEEDGKIYGYEGLKISVWVSCVSFHSYAEIAFESKSDGGKGITD
LNTALKNMFGETLVDNKDDFLQTFSKETQFIRSTVSAGEILKHKHSDDHVNDSV
SNLKVGSDVEAVRMLMGDMTAGHLYSRLVPLVLLLVDGSSPIDVTDSSWELYLL
IQKTSDQQGNFHDRLLGFAAVYRFYHYPDSSRLRLGQILVLPLYQRKGYGRYLL
EVLNNVAIADDVYDFTIEEPVDNLQHLRTCIDVQRLLSFDKVQQAVNSTVSQLK
QGKLSKKTYIPRLLPPPSVVEDARKRFKINKKQFLQCWEILVYLGLDPADKSIQ
DYFSVISNRVRADILGKDSETAGKKVIEVPSDFDPEMSFVMHRAKAGGEANGIQ
VEDNQNKQEEQLQQLIDERLKDIKLIAEKVTQK
288 Histone MAQKHSTAPDPAAEPKKRRRVGFSGIDAGVDPNGCFKVYLVSREEEVGAPDSFC 89 1477
acetyltransferase LDPVDLSHFFEEEDGKIYGYEGLKISVWVSCVSFHSYAEIAFESKSDGGKGITD
LNTALKNMFGETLVDNKDDFLQTFSKETQFIRSTVSAGEILKHKHSDGHVNDSV
SNLKVGSDVEAVRMLMGDMTAGHLYSRLVPLVLLLVDGSNPIDVTDSSWELYLL
IQKTSDQQGNFHDRLLGFAAVYRFYHYPDSLRLRLGQILVLPLYQRKGYGHYLL
EVLNNVAIADDVYDFTIEEPVDNLQHLRTCIDVQRLLSFDKVQQAVNSTVSQLK
QGKLSKKTYIPRLLPPPSVVEDARKRFKINKKQFLQCWEILVYLGLDPADKSIQ
DYFSVISNRVRADILGKDSETAGKKVIEVPSDFDPEMSFVLHRAKAGGETNGIQ
VEDNQNKQEEQLQQLIDERLKDIKLIAQKVSRK
289 Histone MALPMEFWGVEVKAGQPLKVNPGNAKILHLSQASLGECKSSKGNESVPLHVKFG 160 1062
deacetylase DQKLVLGTLSTENFPQLAFDLVFEKEFELSHNWKSGSVYFCGYKSVVHDDDDEF
SDLESDSEEEDLPMIGVENGKVAAQASAKTATASANASKVESSGKQKARIPQPM
KVDEDDSDEDDDDEDEDESDEEGVDGEADSDEEEDESDEEETPKKAEIGKKRAA
DSATKTPVPAKKSKLPTPQKTDGKKGGHTATPHPAKQAGKNPANSANKSQSPKS
AGQVSCKSCSKTFNSDGALQSHSKAKHGGK
290 Histone MEFWGVEVKAGQPLKVNPGNAKILHLSQASLGECKSSKGNESVPLHVKFGDQKL 172 1077
deacetylase VLGTLSTENFPQLAFDLVFEKEFELSHNWKSGSVYFCGYKSVVHDDDDEFSDLE
SDSEEEDLPMIGVENGKVAAQASAKTATASANASKVESSGKQKASIPQPMKVDE
DDSDEDDDEDDDDEDESDEGVDGEADSDEEEDESDEEETPKKAEIGKKRAADSA
TKTPVPAKKSKLPTPQKTDGKKGGHTATPHPAKQAGKNPANSANKSQSPKSAGQ
VSCKSCSKTFNSDGALQSHSKAKHGGK
291 Histone MEFWGVEVKSGEPLNVEPGAETVVHLSQACLGETKEKTKESVLLYVHIGVQKLV 66 989
deacetylase LGTLSADKFPQIPFDLVFEKSFKLSHNWKNGSVFFSGYKTLLPCGSDADSPYSD
SDTDEGLPINVTAQADVPAKKAPVTANANAAKPNLASAKQKVKIVESNEDGKNE
GDDDEDADVSSDDDAEDDSGDEDMVDGGDESSDEDDDDSEEGESSEEEEPKAQP
SKKRPADSVLKTPASDKKSKLETPQKTDGKKASEHVATPYPSKQAGKAIASKGQ
AKQQTPNSNEFSCKPCNRSFKSDQALQSHNKAKHGGS
292 Histone MDTGGNSLPSGPDGVKRKVCYFYDPEVGNYYLLQHMQVLKPVPARDRDLCRFHA 111 1541
deacetylase DDYVAFLRSITPETQQDQLRQLKRFNVGEDCPVFDGLHSFCQTYAGGSVGGAVK
LNHGLCDIAINWAGGLHHAKKCEASGFCYVNDIVLGILELLKQHERVLYVDIDI
HHGDGVEEAFYTTDRVMTVSFHKFGDYFPGTGDIRDIGYGKGKYYSLNVPLDDG
IDDESYHSLFKPIIGKVMEVFKPGAVVLQCGADSLSGDRLGCFNLSIKGHAECV
RYMRSFNVPVLLLGGGGYTIRNVARCWCYETGVALGLEVDDKMPQHEYYEYFGP
DYTLHVAPSNMENKNSRQLLEEIRSKLLENLSKLQHAPSVPFQERPPDTELPEA
DEDQEDPDERWDPDSDMDVDEDRKPLPSRVKRELIVEPEVKDQDSQKASIDHGR
GLDTTQEDNASIKVSDMNSMITDEQSVKMEQDNVNKPSEQIFPK
293 Histone MDTGGNSLPSGPDGVKRKVCYFYDPEVGNYYYGQGHPMKPHRIRMTHALLAHYG 116 1615
deacetylase LLQHMQVLKPVPARDRDLCRFHADDYVAFLRSITPETQQDQLRQLKRFNVGEDC
PVFDGLHSFCQTYAGGSVGGAVKLNHGLCDIAINWAGGLHHAKKCEASGFCYVN
DIVLGILELLKQHERVLYVDIDIHHGDGVEEAFYTTDRVMTVSFHKFGDYFPGT
GDIRDIGYGKGKYYSLNVPLDDGIDDESYHSLFKPIIGKVMEVFKPGAVVLQCG
ADSLSGDRLGCFNLSIKGHAECVRYMRSFNVPVLLLGGGGYTIRNVARCWCYET
GVALGLEVDDKMPQHEYYEYFGPDYTLHVAPSNMENKNSRQLLEDIRSKLLENL
SKLQHAPSVPFQERPPDTELPEADEDQEDPDERWDPDSDMDVDEDRKPLPSRVK
RELIVEPEVKDQDSQKASIDHGRGLDTTQEDNASIKVSDMNSMITDEQSVKMEQ
DNVNKPSEQIFPK
294 Histone MRPKDRISYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVLSYELHTKMEIYRPHK 155 1453
deacetylase AYPAELAQFHSPDYVEFLHRITPDTQHLFPNDLAKYNLGEDCPVFENLFEFCQI
YAGGTIDAARRLNNQLCDIAINWAGGLHHAKKCEASGFCYINDLVLGILELLKY
HARVLYIDIDVHHGDGVEEAFYFTDRVMTVSFHKFGDMFFPGTGDVKEIGGKEG
KFYAINVPLKDGIDDTSFTRLFKAIISKVVETYQPGAIVLQCGADSLAGDRLGC
FNLSIDGHSECVRFVKKFNLPLLVTGGGGYTKENVARCWVVETGVLLDTELPNE
IPENEYFKYFAPDYSLKIPRGNIVLENLNSKSYLSAIKVQVLENLENIQHAPSV
QMQEVPPDFYIPDFDEDEQNPDERMDQHTQDKQIQRDDEYYDGDNDNDHNMDDS
295 Histone MTVAEDFHVNNRSKMVSQATPESRLTGGEDDNSLHNQVDELLCQELPERQVILE 228 2033
deacetylase FEGTRPKPYFSDHNGGENSALGVRATEDDLNSDVEAEEKQKEMITLEDMYKNDGT
LYDDDEDDSDWEPVKRQVELMRWFCTNCTMVNVEDVFLCDICGEHRDSGILRHG
FYASPFMQDVGAPSVEAEVQESREDHARSSPPSSSTVVGFDEKMLLHSEVEMKS
HPHPERADRLQAIAASLATAGIFPGRCRSLPVREITKEELQMVHSSEHVDAVEM
TSHMFSSYFTPDTYANEHSARAARIAAGLCADLASTIISGRSKNGFALVRPPGH
HAGIKHAMGFCLHNNAAVAALAAQGAGAKKVLIVDWDVHHGNGTQEIFDGNKSV
LYISLHRHEGGNFYPGTGAAHEVGTMGAEGYCVNIPWSRRGVGDNDYVFAFHHI
VLPIASAFAPDFTIISAGFDAARGDPLGCCDVTPAGYAQMTHMLSALSGGKLLV
ILEGGYNLRSISSSAVAVIKVLLGDSPISEIADAVPSKAGLRTVLEVLKIQRSY
WPSLESIFWELQSQWGIFLVDNRRKQIRKRRRVLVPIWWKWGRKSVLYHLLNGH
LHVKTKR
296 Histone MAAAPSSPPTNRVDVFWHDGMLSHDTGRGVFDTGSDPGFLDVLEKHPENPDRVR 110 1258
deacetylase NMVSILKRGPISPFISWHTATPALISQLLSFHSPEYINELVEADKNGGKVLCAG
TFLNPGSWDAALLAAGNTLSAMKYVLDGKGKIAYALVRPPGHHAQPSQADGYCF
LNNAGLAVRLALDSGCKRVVVVDIDVHYGNGTAEGFYQSSDVLTISLHMNHGSW
GPSHPQSGSVDELGEDEGYGYNMNIPLPNGTGDRGYEYAVTELVVPAVESFKPE
MVVLVVGQDSSAFDPNGRQCLTMDGYRAIGRTIRGLADRHSGGRILIVQEGGYH
VTYSAYCLHATVEGILDLPDPLLADPIAYYPEDEAFPVKVVDSIKRYLVDKVPF
LKEH
297 Histone MVESSGGASLPSVGQDARKRRVSYFYEPTIGDYYYGQGHPMKPHRIRMAHNLIV 50 1462
deacetylase HYYLHRRMEISRPFPAATTDIRRFHSEDYVTFISSVTPETVSDPAFSRQLKRFN
VGEDCPVFDGIFGFCQASAGGSMGAAVKLNRGDSDIALNWAGGLHHAKKSEASG
FCYVNDIVLGILELLKVHKRVLYVDIDVHHGDGVEEAFYTTDRVMTVSFHKFGD
FFPGSGHIKDTGAGPGKNYALNVPLNDGIDDESFRGMFRPIIQKVMEVYQPDAV
VLQCGADSLSGDRLGCFNLSVKGHADCLRFLRSFNVPLMVLGGGGYTMRNVARC
WCYETAVAVGVEPENDLPYNEYYEYFGPDYTLHVEPCSMENLNAPKDLERIRNM
LLEQLSRIPHAPSVPFQMTPPITQEPEEAEEDMDERPKPRIWNGEDYESDAEED
KSQHRSSNADALHDENVEMRDSVGENSGDKTREDRSPS
298 MAT1 CDK- MVVPSSNPHNREMAIRRRMASTFNKREDDFPSLREYNDYLEEVEEMTFNLIEGV 176 739
activating DVPTIEAKIAKYQEENAEQIMINRAKKAEEFAAALAASKGLPPQTDPDGALNSQ
kinase assembly AGLSVGTQGQYAPAIAGGQPRPTGMAPQPVPLGTGLDTHGYDDEEMIKLRAERG
factor GRAGGWSIELSKKRALEEAFGSLWL
299 Peptidylprolyl MAAIISCHHYHSCCSSLIASKWVGARIPTSCFGRSSTQSNNAASVRQFVTRCSS 150 1529
isomerase SPSSRGQWQPHQNGEKGRSFSLRECAISIALAVGLVTGVPSLDMSTGNAYAASP
ALPDLSVLISGPPIKDPEALLRYALPINNKAIREVQKPLEDITDSLKVAGLRAL
DSVERNVRQASRVLKQGKNLIVSGLAESKKDHGVELLDKLEAGMDELQQIVEDG
NRDAVAGKQRELLNYVGGVEEDMVDGFPYEVPEEYKNMPLLKGRAAVDMKVKVK
DNPNLEECVFRIVLDGYNAPVTAGNFVDLVERHFYDGMEIQRADGFVVQTGDPE
GPAESFIDPSTEKPRTIPLEIMVDGEKAPVYGATLEELGLYKAQTKLPFNAFGT
MAMARDEFEDNSASSQIFWLLKESELTPSNANILDGRYAVFGYVTENQDFLADL
KVGDVIESVQVVSGLDNLANPSYKIAG
300 Peptidylprolyl MAGEDFDIPPADEMNEDFDLPDDDDDAPVMKAGDEKEIGKQGLKKKLVKEGDAW 247 1971
isomerase ETPDNGDEVEVHYTGTLLDGTQFDSSRDRGTPFKFTLGQGQVIKGWDQGIKTMK
KGENAIFTIPPELAYGEAGSPPTIPPNATLQFDVELLSWTSVKDICKDGGIFKK
ILVEGEKWENPKDLDEVLVKYEFQLEDGTTIARSDGVEFTVKEGHFCPAVAKAV
KTMKKGEKVLLTVKPQYGFGEKGKPASGDEGAVPPNATLQITLELVSWKTVSEV
TDDKKVIKKILKEGEGYERPNEGAVVEVKLIGKLQDGTVFVKKGHDDCEELFKF
KIDEEQVVDGLDKAVMNMKKGEVALLTVAPEYAFGSSESKQDLAVVPPSSTVYY
EVELVSFVKDKESWDMNTEEKIEAAGKKKEEGNVIFKAGKYAKASKRYEKAVKY
IEYDTSFSEDEKKQAKALKVACNLNDAACKLKLKDYNQAEKLCTKVLELDSRNV
KALYRRAQAYIELSDLDLAEFDIKKALEIDPHNRDVKLEYKVLKEKVKEFNKKD
AKFYGNMFAKMSKLEPVEKTAAKEPEPMSIDSKA
301 Peptidylprolyl MSTVYVLEPPTKGKVVLNTTHGPLDVELWPKEAPKAVRNFVQLCLEGYYDNTIF 136 1644
isomerase HRIIKDFLVQGGDPTGSGTGGESIYGDAFSDEFHSRLRFKHRGLVACANAGSPH
SNGSQFFITLDRCDWLDRKNTIFGKITGDSIYNLSGLAEVETDKSDRPLDPPPK
IISVEVLWNPFEDIVPRAPVRSLVPTVPDVQNKEPKKKAVKKLNLLSFGEEAEE
EEKALVVVKQKIKSSHDVLDDPRLLKEHIPSKQVDSYDSKTARDVQSVREALSS
KKQELQKESGAEFSNSFREIADDEDDDDDDASFDARMRRQILQKRKELGDLPPK
PKPKSRDGISARKERETSISRDKDDDDDDDQPRVEKLSLKKKGIGSEARGERMA
NADADLQLLNDAERGRQLQKQKKHRLRGREDEVLTKLETFKASVFGKPLASSAK
VGDGDGDLSDWRSVKLKFAPEPGKDRMTRNEDPNDYVVVDPLLEKGKEKFNRMQ
AKEKRRGREWAGKSLT
302 Peptidylprolyl MASAISMHSSGLLLLQGTNGKDVTEMGKAPASSRVANMQQRKYGATCCVARGLT 48 836
isomerase SRSHYASSLAFKQFSKTPSIKYDRMVEIKAMATDLGLQAKVTNKCFFDVEIGGE
PAGRIVIGLFGDDVPKTVENFRALCTGEKGFGYKGCSFHRIIKDFMIQGGDFTR
GNGTGGKSIYGSTFEDENFALKHVGPGVLSMANAGPSTNGSQFFICTVKTPWLD
NRHVVFGQVVDGMDVVQKLESQETSRSDVPRQPCRIVNCGELPLDG
303 Peptidylprolyl MAASFTALSNVGSLSSPRNGSEIRRFRPSCNVAASVRPPPLKAGLSASSSSSFS 49 822
isomerase GSLRLIPLSSSPQRKSRPCSVRASAEAAAAQSKVTNKVYLDISIGNPVGKLVGR
IVIGLYGDDVPQTAENFRALCTGEKGFGYKGSTVHRVIKDFMIQGGDFDKGNGT
GGKSIYGRTFKDENFKLSHVGPGVVSMANAGPNTNGSQFFICTVKTPWLDQRHV
VFGQVLEGMDIVRLIESQETDRGDRPRKRVVVSDCGELPVV
304 Peptidylprolyl MAEAIDLTGDGGVMKTIVRRAKPDAVSPSETLPLVDVRYEGVLAETGEVFDSTH 185 751
isomerase EDNTLFSFEIGKGSVISAWDTALRTMKVGEVAKITCKPEYAYGSTGSPPDIPPD
ATLIFEVELVACKPCKGFSVTSVTEDKARLEELKKQREIAAATKEEEKKRREEA
KAAAAARVQAKLDAKKGHGKGKGKAK
305 Peptidylprolyl MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRALCTGEKGAGRSGKPLH 103 621
isomerase YKGSSFHRVIPGFMCQGGDFTAGNGTGGESIYGSKFADENFVKKHTGPGVLSMA
NAGPGTNGSQFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSSGRTSKP
VVVADCGQLS
306 Peptidylprolyl MPNPKVFFDMTIGGAAAGRVVMELYADTTPRTAENFRALCTGEKGVGRSKKPLH 41 559
isomerase YKGSKFHRVIPSFMCQGGDFTAGNGTGGESIYGVKFADENFIKKHTGPGILSMA
NAGPGTNGSQFFICTTKTEWLDGKHVVFGKVVEGMEVVKAIEKVGSSSGRTSKP
VVVADCGQLP
307 Peptidylprolyl MAEAIDLTGDGGVMKTIVRRAKPDAVSPSETLPLVDVRYEGVLAETGEVFDSTH 127 693
isomerase EDNTLFSFEIGKGSVISAWDTALRTMKVGEVAKITCKPEYAYGSTGSPPDIPPD
ATLIFEVELVACKPCKGFSVTSVTEDKARLEELKKQREIAAATKEEEKKRREEA
KAAAAARVQAKLDAKKGHGKGKGKAK
308 Peptidylprolyl MATARSFFLCALLLLATLYLAQAKKSEDLKEVTHKVYFDVEIAGKPAGRIVMGL 28 639
isomerase YGKAVPKTAENFRALCTGEKGTGKSGKPLHYKGSSFHRIIPSFMLQGGDFTLGD
GRGGESIYGEKFADENFKLKHTGPGLLSMANAGPDTNGSQFFITTVTTSWLDGR
HVVFGKVLSGMDVVYKVEAEGRQSGTPKSKVVIADSGELPL
309 Peptidylprolyl MMRREISVLLQPRFVLAFLALAVLLLVFAFPFSRQRGDQVEEEPEITHRVYLDV 135 812
isomerase DIDGQHLGRIVIGLYGEVVPRTVENFRALCTGEKGKSANGKKLHYKGTPFHRII
SGFMIQGGDVIYGDGKGYESIYGGTFADENFRIKHSHAGIISMVNSGPDSNGSQ
FFITTVKASWLDGEHVVFGRVIQGMDTVYAIEGGAGTYNGKPRKKVIIADSGEI
PKSKWDEER
310 Peptidylprolyl MWATAEGGPPEVTLETSMGSFTVELYFKHAPRTSRNFIELSRRGYYDNVKFHRI 119 613
isomerase IKDFIVQGGDPTGTGRGGESIYGKKFEDEIKPELKHTGAGILSMANAGPNTNGS
QFFITLAPCPSLDGKHTIFGRVCRGMEIIKRLGSVQTDNNDRPIHDVKILRTSV
KD
311 Peptidylprolyl MSNPKVFFDILIGKMKAGRVVMELFADVTPKTAENFRALCTGEKGIGRSGKPLH 38 562
isomerase YKGSTFHRIIPNFMCQGGDFTRGNGTGGESIYGMKFADENFKIKHTGLGVLSMA
NAGPDTNGSQFFICTEKTPWLDGKHVVFGKVIDGYNVVKEMESVGSDSGSTRET
VAIEDCGQLSEN
312 Peptidylprolyl MDDDFEFPASSNVENDDDDGMDMDDMGGDVPEEEDPVASPAVLKVGEEREIGKA 109 1872
isomerase GFKKKLVKEGEGWETPSSGDEVEVHYTGTLLDGTKFDSSRDRGTPFKFKLGRGQ
VIKGWDEGIKTMKKGENAIFTIPPELAYGESGSPPTIPPNATLQFDVELLSWSS
VKDICKDGGILKKVLVEGEKWDNPKDLDEVFVKYEASLEDGTLISKSDGVEFTV
GDGYFCAALAKAVKTMKKGEKVLLTVMPQYAFGETGRPASGDEAAVPPDASLQI
MLELVSWKTVSDVTKDKKVLKKTLKEGEGYERPNDGAAVQVRLCGKLQDGTVFV
KKDDEEPFEFKIDEEQVIDGLDRAVKNMKKGEVALVTIQPEYAFGPTESQQDLA
VVPANSTVYYEVELLSFVKEKESWEMNNQEKIEAAARKKEEGNAAFKAGKYVRA
SKRYEKAVRFIEYDSSFSDEEKQQAKTLKNTCNLNDAACKLKLKDFKEAEKLCT
KVLEGDGKNVKALYRRAQAYIQLVDLDLAEQDIKKALEIDPNNRDVKLEYKILK
EKVREYNKRDAQFYGNMFAKMNKLEHSRTAGMGAKHEAAPMTIDSKA
313 Peptidylprolyl MAKPRCFMDISIGGELEGRIVGELYTDVAPKTAENFRALCTGEKGIGPHTGAPL 74 1159
isomerase HYKGVRFHRVIKGFMVQGGDISAGDGTGGESIYGLKFEDENFDLKHERKGMLSM
ANSGPNTNGSQFFITTTRTSHLDGKHVVFGRVVKGMGVVRSVEHVTTAAGDCPT
VDVVIADCGEIPAGADDGIRNFFKDGDTYPDWPADLDESPAELSWWMDAVDSIK
AFGNGSYKKQDYKMALRKYRKALRYLDICWEKEGIDEVESSSLRKTKSQIFTNS
SACKLKLCDLKGALLDAEFAVRDGENNAKAYFRQGQAHMELNDIDAAAESFSKA
LELEPNDVGIKKELNAAKKKIFERREQEKRAYRKMFL
314 Peptidylprolyl MTKRKNPLVFLDVSIDGDPVERIVIELFADTVPRTAENFRSLCTGEKGVGKTTG 54 2045
isomerase KPLHYKGSYFHRIIKGFMAQGGDFSNGNGTGGESIYGGKFADENFKLAHDGPGL
LSMANGGPNTNGSQFFIIFKRQPHLDGKHVVFGKVMRGMEVVKKIEQVGSANGK
PLQPVKIVDCGETSETGTQDAVVEEKSKSATLKAKKKRSARDSSSESRGKRRQR
KSRKERTRKRRRYSSSDSYSSESSDSDSESYSSDTESESKSHSESSVSDSSSSD
GRRRKRKSTKREKLRRQRGKDSRGEQKSARYDKKSRHKSADSSSDSESESSSRS
RSRDDKKKSSRRESARSVSKLKDAEANSPENLESPRDREIKKVEDNSSHEEGEF
SPKNDVQHNGHGTDAKFGKYDDQRPRSDGSKKSSGSMRDSPKRLANSVPQGSPS
SSPAHKASEPSSSIRARNPSRSPAPDGNSKRIRKGRGFTERFSYARRYRTPSPE
DVTYRPYHYGRRNFHDRRNDRYSNYRSYSERSPHRRYRSPPRGRSPPRYQRRRS
RSRSVSRSPGGNKGRYRGRDQSRSRSRSRSRSPRRGSSPANKQLPLSERLKSRL
GTRVDEHSPRRRRSSSRSHDSSRSRSPDEVPDKHEGKAAPVSPARSRSSSPSGR
GLVSYGDASPDSGIN
315 Peptidylprolyl MSVLLVTSLGDIVVDLHADRCPLTCKNFLKLCRIKYYNGCVFHTVQKDFTAQTG 53 1879
isomerase DPTGTGTGGDSVYKFLYGDQARFFMDEIHLDLKHSKTGTVAMASGGENLNASQF
YFTLRDDLDYLDGKHTVFGEVAEGLETLTRINEAYVDEKGRPYKNIRIRHTYIL
DDPFDDPPQLAELIPDASPEGKPKDEVVDDVRLEDDWVPLDEQLGPAQLEEAIR
AKEAHSRAVVLESIGDIPDAEIKPPDNVLFVCKLNPVTEDEDLHTIFSRFGTVV
SADVIRDFKTGDSLCYAFIEFENKDSCEQAYFKMDNALIDDRRIKVDFSQSVAK
LWSQFKRKDSQAAKGKGCFKCGAPDHMARECPGSSTRQPLSKYILKEDNAQRGG
DDSRYEMVFDEDAPESPSHGKKRRGRDDRDDRHKMSRQSVEETKFNDREGGHSV
DKHRQSERSKHREDEMSRDSKASEAGRRRIDRDFPEEERDGEKYTESHRDRDGK
RGDYRDYRKGEADVQTHGDRRGDENYRRKSAAYDDGHEGAGAARRKDSNDDHHA
YRRGYGDSRKGTRDEDDDGRGRRDDPSYRRSSGHKDSSNGGREEQKYRSGETDG
KSHPERSHRGDRRR
316 Peptidylprolyl MRPFNGGSSIACLVLVIAAGALAESQGPHLGSARVVFQTNYGDIEFGFFPGVAP 7 690
isomerase RTVDHIFKLVRLGCYNTNHFFRVDKGFVAQVADVANGRTAPMNDEQRTEAEKTI
VGEFSNVKHVRGILSMGRYDDPDSAQSSFSILLGDAPHLDGKYAIFGRVTKGDE
TLKKLEQLPTRREGMFVMPTERITILSSYYYDTGAESCEEENSTLRRRLAASAV
EVERQRMKCFP
317 Peptidylprolyl MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGTGRSGKPLH 83 601
isomerase FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA
NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP
VVIADSGQLA
318 Peptidylprolyl MRFTSITSAIALFAAAASALDKPLDIKVDKAVECSRKTKAGDKIQVHYRGTLEA 125 535
isomerase DGSEFDASYKRGQPLSFHVGKGQVIKGWDQGLLDMCPGEKRTLTIQPDWGYGSR
GMGPIPANSVLIFETELVEIAGVAREEL
319 Peptidylprolyl MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRALCTGEKGAGRSGKPLH 55 573
isomerase YKGSSFHRVIPGFMCQGGDFTAGNGTGGESIYGSKFADENFVKKHTGPGVLSMA
NAGPGTNGSQFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSSGRTSKP
VVVADCGQLS
320 Peptidylprolyl MAVATRSRWVAMSVAWILVLFGTLALIQNRLSDTGASSDPKLVHRKVGEEKKKP 147 842
isomerase DDLEEVTHKVFFDVEIGGKPAGRIVMGLFGKTVPKTVENFRALCTGEKGIGKSG
KPLNYKGSQFHRIIPKFMIQGGDFTLGDGRGGESIYGNKFSDENFKLKHTDAGR
LSMTNAGPDTNGSQFFITTVTTSWLDGRHVVFGKVLSGMDVVHKIEAEGGQSGQ
PKSIVVISDSGELDL
321 Peptidylprolyl MAVTLHTNLGDIKCEIFCDEVPKAAEHNARGILSMANSGPNTNGSQFFIAYAKQ 167 487
isomerase PHLNGLYTIFGRVIHGFEVLDIMEKTQTGPGDRPLAEIRLNRVTIHANPLAG
322 Peptidylprolyl MAVATRSRWVAMSVAWILVLFGTLALIQNRLSDTGASSDPKLVHRKVGEEKKKP 195 890
isomerase DDLEEVTHKVFFDVEIGGKPAGRIVMGLFGKTVPKTVENFRALCTGEKGIGKSG
KPLNYKGSQFHRIIPKFMIQGGDFTLGDGRGGESIYGNKFSDENFKLKHTDAGR
LSMANAGPDTNGSQFFITTVTTSWLDGRHVVFGKVLSGMDVVHKIEAEGGQSGQ
PKSIVVISDSGELDL
323 Peptidylprolyl MGNPKVFFDMSIGGQPAGRIVMELYADVVPRTAENFRALCTGEKGAGRSGKPLH 68 586
isomerase YKGSSFHRVIPGFMCQGGDFTAGNGTGGESIYGSKFADENFVKKHTGPGVLSMA
NAGPGTNGSQFFVCTAKTEWLDGKHVVFGQIVDGMDVVKAIEKVGSSSGRTSKP
VVVADCGQLS
324 Retinoblastoma MSPVAANAMEEAAEPEVPAPVTPSKDDADTDAAVSRFLGFCKSKLGLAEGNCVQ 182 3265
related protein SSTLLRKTAHVLRSSGTVIGTGTAEEAERYWFAFVLYTVRRVGERKAEDEQNGS
DETEVPLSRILKASVLNLIDFFKEIPQFVIKAGAIVSGIYGANWDSRLEAREMQ
TNYVHLCILCKFYKRICGEFFILNDAKDDMKSADSSTSDPVIMYQPFGWLLFLA
LRIHALSRFKDLVSSTNALVSVLAILIIHLPTRFRKFSISDSSQLVKRSEKGVD
LVGSLAYRYDTSEDEIKRTLEKANNVIAEILGITPPPASECKAENLENVDTDGL
IYFGNLMEETSLSSILSTLEKIYEDATRNDSEFDERVFINDDDSLLVSGSLSGA
AINLTGAKRKYDSFASPAKTITRPLSPSRSPASHINGIIGGTNLRITATPVATA
MTTAKWLRTFVSPLPSKPSTDLQGFLASCDRDVTSDVIRRANIILEAIFPNSPI
GERTVTGGLQNANLMDNMWAEQRRLEALKLYYRVLEAMCRAEAQILHSNNLTSL
LTNERFHRCMLACSAELVLATHKTVTMLFPAVLERTGITAFDLSKVIESFVRHE
ETLPRELRRHLNTLEERLLENMVWERGSSMYNSLVVARPALAPEINRLGLLPEP
MPSLDAIALLINFSSSGLPQSPVQKHEASPGQNGDIRSPKRISTEYRSVLVERN
FTSPVKDRLLALSNIKSKLPPPPLQSAFASPTRPHPGGGGETCAETAIHIFFSK
ITKLAAVRINAMLERLQLSQQIKEGVYCLFQQILSQRTNLFFNRHIDQVILCCF
YGVAKINQINLTFREIIYNYRKQPQCKPQVFRNVFVDWSTRRNGKAGNEHVDII
SFYNEIFIPSVKPLLVELGPTGATTRTNRTSEVGNKNDAQCPGSPKISSFPTLP
DMSPKKVSASHNVYVSPLRSSKMDASISHSSKSYYACVGESTHAYQSPSKDLVA
INSRLNGNRKVRGTLNFDDVDAGLVSDSMVANSLYLQNGSSMSSSTAKSSEKPES
325 WD40 repeat MRPILMKGHERPLTFLKYNREGDLLFSCAKDHTPTVWFADNGERLGTYRGHNGA 165 1145
protein VWCCDVSRDSMRLITGSADTTAKLWSVQNGTQLFTFNFDSPARSVDFSIGDKLA
VITTDPFMELPSAIHVKRIARDPADQASESVLVLRGHQGRIARAVWGPLNKTII
SAGEDAVIRIWDSETGKLLRESDKETGHKKAVTSLMKSVDGSHFVTGSQDKSAK
LWDIRTLTLIKTYVTERPVNAVTMSPLLDHVVLGGGQDASAVTMTDHRAGKFEA
KFFDKILQEEIGGVKGHFGPINALAFNPDGKSFSSGGEDGYVRLHHFDPDYFNI
KI
326 WD40 repeat MDKKRTVVPLVCHGHSRPVVDLFYSPITPDGFFLISASKDSSPMLRNGETGDWI 529 1569
protein GTFEGHKGAVWSCCLDTNALRAASGSADFSAKLWDALSGDELHSFEHKHIVRSC
AFSEDTHLLLTGGVEKILRIFDLNRPDAPPREVDNSPGSIRTVAWLHSDQTILS
SCTDIGGVRLWDVRSGKIVQTLETKSPVTSSEVSQDGRYITTADGSTVKFWDAN
HFGLVKSYNMPCNIESASLEPKLGNKFIAGGEDMWVHIFDFHTGEEIGCNKGHH
GPVHCVRFSPGGESYASGSEDGTIRIWQTGPANNVEGDANPSNGPVTGKAKVGA
DEVTRKVEDLQIGKEGKDWREG
327 WD40 repeat MAEGLILKGTMRAHTDMVTAIAIPIDNSDMVVTSSRDKSIILWHLTKEEKVYGV 156 1136
protein PRRRLTGHSHFVQDVVLSSDGQFALSGSWDGELRLWDLATGVSARRFVGHTKDV
LSVAFSIDNRQIVSASRDRTIKLWNTLGECKYTIQEGEAHTDWVSCVRFSPNTL
QPTIVSASWDRTIKVWNLTNCKLRNTLAGHNGYVNTVAVSPDGSLCASGGKDGV
ILLWDLAEGKRLYNLEAGAIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVED
LRVDLKNEADKTDGTTTAASNKKVIYCTSLNWSADGSTLFSGYNDGVIRVWGTG
RY
328 WD40 repeat MAEGLHLKGTMKAHTDMVTAIAVPIDNADMIVTSSRDKSIILWHLTKEDKVYGV 90 1073
protein PRRRLTGHSHFVQDVVLSSDGQFALSGSWDGELRLWDLATGVSARRFVGHTKDV
LSVAFSIDNRQIVSASRDRTIKLWNTLGECKYTIQEGEAHNDWVSCVRFSPNTL
QPTIVSASWDRTVKVWNLTNCKLRNTLQGHSGYVNTVAVSPDGSLCASGGKDGV
ILLWDLAEGKKLYSLEAGAIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVED
LRVDLKNEADMSDGTTGAMSSNKKVIYCTSLNWSADGSTLFSGYNDGVIRVWGI
GRY
329 WD40 repeat MAEGLHLKGTMKAHTDMVTAIAVPIDNADMIVTSSRDKSIILWHLTKEDKVYGV 66 1049
protein PRRRLTGHSHFVQDVVLSSDGQFALSGSWDGELRLWDLATGVSARRFVGHTKDV
LSVAFSIDNRQIVSASRDRTIKLWNTLGECKYTIQEGEAHNDWVSCVRFSPNTL
QPTIVSASWDRTVKVWNLTNCKLRNTLQGHSGYVNTVAVSPDGSLCASGGKDGV
ILLWDLAEGKKLYSLEAGAIIHSLCFSPNRYWLCAATENSIKIWDLESKSIVED
LRVDLKNEADMSDGTTGAMSSNKKVIYCTSLNWSADGSTLFSGYNDGVIRVWGI
GRY
330 WD40 repeat MSGVPAPPFATTTPENGTMSSNSPAFHRDSDDDDDQGEVFLDDSDIIHEVAVDD 227 1512
protein EDLPDADDEADEAEEADDSLHIFTGHNGEVYSLACSPTDATLVATGAGDDKGFL
WRIGHGDWAVELQGHKDSISSLAFSLDGQLLASGSLDGVIQIWDVPSGNLKGTL
DGPGGGIEWIRWHPKGHIILAGSEDSTVWMWNADKMAYLNMFSGHGNSVTCGDF
TPDGKTICTGSDDATLRIWNPKSGENIHVVKGHPYHAEGLTSMAISSDSGLAIT
GAKDGSVRIVNISSGRVVSSLDAHADSVEFVGLALSSPWAATGSLDQKLIIWDL
QHSSPRATCDHEDGVTCLSWVGASRFLASGCVDGKVRVWDSLSGDCVRTFHGHS
DAIQSLSVSANEEFLVSVSIDGTARVFEIAEFH
331 WD40 repeat MGTSQHQLSSCLQLLPRRRGNKNLIFRRTMASGGAAAVAPPPGYKPYRHLKTLT 33 1076
protein GHVAAVSCVKFSNDGTLLASASLDKTLIIWSSAALSLLHRLVGHSEGVSDLAWS
SDSHYICSASDDRTLRIWSSRSPFDCLKTLRGHTDFVFCVNFNPQSSLIVSGSF
DETIRIWEVKTGRCLNVIRAHSMPVTSVHFNRDGSLIVSGSHDGSCKIWDTKNG
ACLKTLIDDTVPAVSFAKFSPNGKFILVATLNDTLKLWNYATGKFLKIYTGHKN
SVYCLTSTFSVTNGKYIVSGSEDRCICIWDLQGKNLIQKLEGHSDTVISVTCHP
SENKIASAGLDSDRTVRIWLQDA
332 WD40 repeat MPSQKIETGHQDIVHDVAMDYYGKRVATASSDTTIKIIGVSNSSGSQHLASLSG 65 973
protein HKGPVWQVAWAHPKFGSILASCSYDGQVILWKEGNQNDWAQAHVFNDHKSSVNS
IAWAPHELGLCLACGSSDGNISVFTARPDGGWDTTRIEQAHPVGVTSVSWAPSM
APGALVGSGLLDPVQKLASGGCDNTVKVWKLYNGTWKMDCFPALQMHSDWVRDV
AWAPNLGLPKSTIASASQDGTVVIWTVAKEGEQWQGKVLKDFKTPVWRVSWSLT
GNLLAVADGNNNVTLWNEAVDGEWQQVTTVEP
333 WD40 repeat MKIAGLKSVENAHDESVWAAAWVPATESRPALLLTGSLDETVKLWRPDELALER 82 1047
protein TNAGHFLGVVSVAAHPSGVIAASASIDSFVRVFDVDTNATIATLEAPPSEVWQM
QFDPKGTTLAVAGGGSASIKLWDTATWELNATLSIPRPEQPKPSEKGNKKFVLS
VAWSPDGRRLACGSMDGTISIFDVARAKFLHHLEGHFMPVRSLVFSPVEPRLLF
SASDDAHVHMYDSEGKSLVGSMSGHASWVLSVDVSPDGAALATGSSDRTVRLWD
LSMRAAVQTMSNHSDQVWGVAFRPMAGAGVRAGGRLASVSDDKSISLYDYS
334 WD40 repeat MEIDLGNLAFDVDFHPSEQLVASGLITGDLLLYRYGDGSSPEKLLEVRAHGESC 43 1101
protein RAVRFINDGKAILTGSPDCSILATDVETGSVVARVENAHEAAVNRLVNLTESTI
ATGDDNGCIKVWDTRQRSCCNTFSAHEDFISDMTFASDSMKLVVTSGDGTLSVC
NLRSNKVQTRSEFSEDELLSVVIMKNGRKVVCGTQSGTLLLYSWGFFKDCSDRF
VDLSPSSVDALLKLDEDRIIAGTENGLISLIGILPNRIIQPIAEHSDHPIERLA
FSHDKKFLGSISHDQTLKLWDLNDILGSEDSPSSQAAIDDSDSDEMDVDANPPD
SSKGNKKKHSGKGNDVGNANNFFADLGD
335 WD40 repeat MSQQPSVILATASYDHTIRFWEAKSGRCYRTIQYPDSQVNRLEITPHKRYLAVA 142 1095
protein GNPSIRLFDVNSNTPQPVMSFDSHTNNVMAVGFQYDGNWMYSGSEDGTVRTWDL
RARGCQREYESRGAVNTVVLHPNQTELISGDQNGNIRVWDLTANSCSCELVPEV
DTAVRSLTVMWDGSLVVAANNNGTCYVWRLLRGSQTMTNFEPLHKLQAHNGYIL
KCLLSPEFCEPHRYLATASSDHTVKIWNVEGFTLEKTLIGHQRWVWDCVFSVDG
AYLITASSDTTARLWSMSTGQDIRVYQGHHKATTCCALHDGAEGSPG
336 WD40 repeat MEDAMDMEVEVEVEAEEHSPSSSNPSGSSFRRFGLKNSIQTNFGSDYVFEITPK 61 1257
protein FDWSLMGVSLSSNAVKLYSPTTGQYCGECRGHSDTVNGISFSGPSSPHVLHSCS
SDGTIRAWDTRSFKEVSCISAGPSQEIFSFSFGGSSDSLLSAGCKSQILFWDWR
NKKQVACLEDSHVDDVTQVCFVPHHQNKLISASVDGLICIFDTAGDINDDEHME
SVINVGTSIGKVGIFGQTFEKLWCLTHIETLSVWDWKEGTNEANFEDARKLASD
SWSLDHIDYFVDCHSAEEGEGLWVIGGTNAGTLGYFPVKYKGGAAIGSPEAVLG
GGHSDVVRSVLPMSGMAGTTSKTRGIFGWTGGEDGRLCCWLSDDSSATSRSWMS
SNLVLKSSRSHHKKNRHQPY
337 WD40 repeat MSQHQEYPMEYAADDYDVGEVEDDMYFHERVMGDSDTDEDEEYDHLDNKITDTS 193 1527
protein AADARRGKDIQGIPWERLSVTREKYRRTRIEQYKNYENVPQSGESSEKDCKPTR
KGGNYYEFWRNTRSVKSTILHFQLRNLVWSTTKHDVYLMSHFSIIHWSSLTCKK
TEVLDVYGHVAPREKHPGSLLEGFTQTQVSTLAVRDKLLIAGGFQGELICKNLD
RPGVSYCCRTTYDDNAITNAVEIYDYPSGAVHFMASNNDCGVRDFDMEKFELSR
HFTFPWPVNHTSLSPDGKLLVIVGDNPEGIVVDSQRGKTIRPLQGHLDFSFASA
WHPDGHIFATGNQDKTCRIWDIRNLSKSVAVLKGNLGAIRSIRFTSDGRFMAMA
EPADFVHVYDVKSGYEKEQEIDFFGEISGVSFSPDTESLFVGVWDRTYGSLLQY
NRCRNYSYLDSM
338 WD40 repeat MGASSDPNPDVSDEHQKRSEIYTYEAPWHIYAMNWSVRRDKKYRLAIASLLDHP 109 1155
protein AAAAAVPNRVEIVQLDDSTGEIRADPNLSFDHPYPATKAAFVPDKDCQRADLLA
TSSDFLRIWRIADDSSRVDLRSFLNGNKNSEFCRPLTSFDWNEAEPKRIGTSSI
DTTCTIWDIERETVDTQLIAHDKEVYDIAWGGVSVFASVSADGSVRVFDLRDKE
HSTIIYESSEPDTPLVRLGWNKQDPRYMATIIMDSAKVVVLDIRYPTMPVVELQ
RHQASVNAIAWAPHSSCHICTAGDDSQALIWDLSSMAQPVEGGLDPILAYTAGA
EIEQLQWSSSQPDWVAIAFSLKLQ
339 WD40 repeat MRGGGGGGDATGWDEDAYRESVLKEREVQTRTVFRAAFAPSPSPSPSPDAVVVA 71 1213
protein SSDGSVASYSISACLSDHRLQSLRFADAKSQNVLEAEPACFLQGHDGPAYDVKF
YGEGEDSLLLSCGDDGRIRGWMWRDITSSEAHDHSQGNSAKPVLDLVNPQSRGP
WGALSPIPENNALAVDVKRGSIYAAAGDSCAYCWDVECGKIKTVFKGHSDYLHC
IAARNSSSQIITGSEDGTARIWDCRSGKCVQVIDPDKDHKKGFFASVSCLALDA
SESWLVCGRGRDLSVWSISASDCIAKISTNAPAQDVLFDDNQILLVGAEPLISR
LDMNGAVLSQIHCAPQSVFSVSLHQSGVTAVGGYGGLVDVISQFGSHLCTFRCK
CI
340 WD40 repeat MEAPIIDPLQGDFPEVIEEYLEHGIMKCIAFNRRGTLLAAGCTDGSCIIWDFET 109 1785
protein RGVAKELRDKECTAAITSVCWSKYGHRILVSASDKSLILWDVLSGEKIAHTTLQ
HTVLQACLHPGSSTPSICLACPFSSAPMIVDLNTGSTTALPVLTADVSNGATPL
SRNKTSDTSVTYSPCNACFNKHGDLVYAGTSKGEILIIDHKNVRVCAIVLVSGG
AVIKNVVFSRNGQYMLTNSNDRLIRIYKNLLPPKDGLKMLDELNESFNESDDVE
KLKAIGSKCLELLHEFQDSITRVQWKAPCFSGDGEWVIGGAASRGEHKIYIWDR
AGHLVKILEGPKEALMDLAWHPVHPIIISVSLTGLVYIWAKDYTENWSAFAPDF
KELEENEEYVEREDEFDLVPETEKVKGLDVHEDDEVDVLTVERDSVFSDSDMSQ
EELCFLPAVPCLDIPEQQDKCVGSCSKLPDGNHSGSPLSVEAGQNGNASNHNSS
PLEPMENSTADDTDGVRLKRKRKPSEKGLELQAEKVKKPVKPLKSSGRLSKTNK
PVIDPDSSNGVYGDDGSD
341 WD40 repeat MRGVSWPEDGNNPSTSSSSQRNQQQAHAPRAVSGHAASHPSASNIFKLLVQREV 364 2685
protein SPRSKHSSKKLWREASKCQPYPFQQSCEAVRDVRQGLISWVESASLRHLSAKYC
PLVPPPRSTIAAAFSPDGKILASTHGDHTVKLIDSQTGSCLKVLRGHRRTPWVV
RFHPLYPEILASGSLDHEVRLWDANTAECIGSRNFYRPIASIAFHARGELLAVA
SGHKLYIWHYNRRGETSSPTIVLRTQRSLRAVHFHPHAAPFLLTAEVNDLDSAD
SAMTLATSPGYLHYPPPTVYFADAHSHERSRLADELPLMPLPLLMWPSFTRDDG
RVPLQRIDGDVGLNGQQRVDSSSSVRLWTYSTPSGQYELLLSPVESGNSPSMPE
ETGNNAFSSAVEAEVSQSAMDTVEDMEVQPEERNTQFFSFSDPRFWELPLLHGW
LVGQTQAGPRSVRQSSPGDIETQSAFGEVASVSPITSGVMPVSMDPSRFGGRSG
SRYRSPGSRGVHVTGPNNDGPRDENDPQSVVSKLRSELAASLAAAASTELPCTV
KLRIWPHDVKDPCAQLDLESCRLTIPHAVLCSEMGAHFSPCGRFLAACVACVLP
HLESDPGLHGQVNQDVTGVATSPTRHPISAHQIMYELRIYSLEEATFGIVLASR
PVRAAHCLTSIQFSPTSEHLLLAYGRRHSSLLKSIVIDGENTVPIYTILEVYRV
SDMELVRVLPSAEDEVNVACFHPSVGGGLIYGTKEGKLRILHYDSSHGLNLKSS
GFLDENVPEVQTYALEC
342 WD40 repeat MDSAVAIAALSLVVGAAIALLFFGNYFRKRRSEVVAMAEADLQPHPKNPSRPPP 96 1412
protein QPAAKKVHAKSHAHGADKDKNKRHHPLDLNTLKGHGDSVTGLCFASDGRSLATA
CADGVVRVFKLDDASNKSFKFLRINLPAGGHPTAVAFGDGVSSVIVASQHLSGC
SLYMYGEEKPTNLDSNKQQTKLPMPEIKWEHHKVHEQKAILTLSGAAANYDSGD
GSTIIASCSEGTDIIIWHAKTGKILGNVDTNQLKNTMSAISPNGRFIAAAAFTA
DVKVWEIVYSKDGSVKGVTKVMQLKGHKSAVTWLCFTPNSEQIVTASKDGSIRI
WNINVRYHLDEDTKTLKVFPIPLQDSSGTTLHYERLSLSPDGKILAATHGSMLQ
WLCIETGKVLDTAEKAHDGDITCMSWAPQSIPTGDKKVNVLATASGDKKVKLWA
APPLPS
343 WD40 repeat MEVEPKKASKTFPVKPKLKPKPRTPSGKTPESKYWSSFKTTHPLDNLSFSVPSL 116 1702
protein AFSPSPPHLLAAAHSATVSLFSPHRTTISSFSDVVSSLSFRSDGQLLAASDLSG
LIQVFDVRSRTPLRRLRSHARPVRFVRYPVLDKLHLVSGGDDALVKYWDVAGES
VVSELRGHKDYVRCGDCSPADANCFVTGSYDHVVKLWDVRVRDGNRAATEVNHG
SPVQDVIFLPSGSLVATAGGNSVKIWDLIGGGRMVYSMESHNKTVTSICVGTMG
AQQSGEEGVQLRILSVGLDGYMKVFDYSRMKVTHSMRFPAPLLSIGFSPDSNVR
AIGTSNGILYVGKRKAKENAEGGANGILGLGSVEEPRRRVLKPSFYRYFHRGQS
EKPSEGDYLVMRPKKVKLAEHDKLLKKFQHKNALISVLGGNDPEKVVAVMEELV
ARRALLKCVLNLDADELGLILTFLHKNSTVPRYSSLLLGLAKKVIDLRLEDIRA
SDALKGHIRNLKRSVDEEIRIQEGLQEIQGMVSPLLRIAGRR
344 WD40 repeat MQGGSSGVGYGLKYQARCISDVKADTDHTSFLTGTLSLKEENEVHLLRLSSGGT 46 1101
protein ELICEGLFSHPSEIWDLSSCPFDQRIFSTVFSTGESYGAAVWQIPELYGQLNSP
QLEKIASLDAHSRKISCVLWWPSGRHDKLVSIDEENIFLWGLDCSKKSAQVQSQ
ESAGMLHNLSGGAWDPHDVNTVAATCESSIQFWDLRTMKKANSLESVHARDLDY
DMRKKHLLVTSEDESGVRVWDLRMPKAPIQEFPGHTHWTWAVRCNPDYEGLILS
AGTDSAVNLWWSSTASSDELISERLIDSPTRKLDPLLHSYNDYEDSVYGLAWSS
REPWIFASLSYDGRVVVESVKPFLSRK
345 WD40 repeat MAEEEGSAELEQQLEEEFAVWKKNTPILYDLLISHALEWPSLTVHWAPLLPQPS 23 1258
protein SSAAAAAGDPSLAAHRLVLGTHTSDGAPNFLILADALLPSSESDHCGDDAVLPK
VEISQKIRVDGEVNRARFMPQNHNIVGAKTNGCEVYVFDCSKQAAKQHDGGFDP
DLRLTGHDGEGYGLSWSPLKENYLLSASHDKKICLWDISAAAQDKVLGAMHVFE
AHEGAVGDASWHSKNDNLFGSAGDDCQLMIWDLRTNKAQQCVKAHEKEVNSVSF
NSYNDWILATASSDTTVGLFDMRKLTTPLHVFSSHEGEVLQVEWDPNHEAVLAS
SSEDRRVMVWDLNRIGDEQQEGDASDGPAELLFSHGGHKAKISDFSWNKNEPWV
ISSVAEDNSVQVWQMAESICGDDDDMQAMEGYI
346 WD40 repeat MGNYGEEDEDQYFDALEETASVSDRGSNSSDCCSSGSGLDENVLDSLGFEFWTK 404 2644
protein FPESVRARRNRFLMLTGLGIEANSVDKEDAFPPSCNEIEVYTCKVTRDDGAVQR
SLDSYNCISLLQSSTSIRSNQEVESLRGDSLLSSFRGRSKESDDLTELCGMGCP
ESKRNAVSEFGSVSQGSIEELRRIVASSPLVHPLLHRKLEYERELIETKQKMGA
GWLRKFGSATCISGRQGDTWSDPDDLEITAGMKMRRVRAHSSKKKYKELSSLYA
AQEFLAHEGSISTMKFSMDGQYLASAGEDTVVRVWKVTEEDRSERVNVTVDPSC
LYFALNESTQLASLNTNKEHIGKAKTFQRSSDSSCVILPLKVFQITEKPWHEFK
GHNGEVLDLSWSSKGYLLSSSTDKTVRLWRVGCDRCQRVYSHNDYVTCISFNPV
NENFFISGSIDGKVRIWNVFGGQVVAYIDCREIVSAVCYRSDGKGAIVGTMTGN
CLFYSIKDNHLQMDAQVYLHGKKKSPGKRITGFQFPPNDPGKLMITSADSVIRV
LSGLDVVCKLKGPRNSGGPMIATFTSDGKHVISASEDSNVYIWNYAGQDKTSSR
VKKIWSCESFWSSNASVALPWCGIRTVPEALAPPSRSEERRASCAENGENHHML
EEYFQKMPPYSPDCFSLSRGFFLELLPKGSATWPEEKLSDTSPPTVSSQAISKL
EYKFLKSACHSVLSSAHMWGLVIVTAGWDGRIRTYHNYGLPVRS
347 WD40 repeat MDIDFKEYRLRCELRGHEDDVRGVCVCGDGSIGTSSRDRTVRLWAPSAGERRKY 107 2383
protein EVARVLLGHKSFVGPLAWVPPSEELPEGGIVSGGMDTLVMAWDLRNGEAQTLKG
HQLQVTGIVLDGGDIVSASVDCTLIRWKNGQLTEHWEAHKAPIQAVIRLPSGEL
VTGSSDTTLKLWRGKTCTQTFVGHTDTVRGLAVMPDLGILSASHDGSIRLWAVS
GECLMEMVDHTSIVYSVDSHASGLIVSGSEDRFAKIWKDGVCFQSIEHPGCVWD
VKFLEDGDIVTACSDGTIRIWTNQEDRMANSTELELFDLELSSYKRSRKRVGGL
KLEELPGLEALQVPGTSDGQTKVIREGDNGVAYAWNSTELKWDKIGEVVDGPED
SMNRPALDGVQYDYVFDVDIGDGEPTRKLPYNRSDNPYDTADKWLLKENLPLSY
RQQIVEFILANSGQRDFNLDPSFRDPYTGSSAYVPGAPSQLAAKQARPTFKHIP
KKGMLVFDAAQFDGILKKINEFNNTLLSNQEKKNLSLTDIEISRLGAVVKILKD
TSHYHSSKFADADFDLMLKLLESWPYEMMFPVIDIFRMVILHPDGADGLLRHQE
DKKDVLMESIKRATGNPSVPANFLTSIRAVTNLFKNSAYYSWLQKHRSEMLDAF
SSCSSSSNKNLQLSYATLLLNYAVLLIEKKDEEGQSQVLSAALELAENESLEVD
ARYRALVAIGSLMLDGLVKRIALDFDVEHIAKAARTSKEAKIAEVGADIELLIK
QS
348 WD40 repeat MEFTEAYKQSGPCCFSPNARFIAVAVDYRLVIRDTLSLKVVQLFSCLDKISYIE 243 1625
protein WALDSEYILCGLYKRPMIQAWSLIQPEWTCKIDEGPAGIAYARWSPDSRHILTT
SDFQLRLTVWSLVNTACVHVQWPKHASKGVSFTRDGKFAAICTRHDCKDYINLL
SCHNWEIMGVFAVDTLDLADIQWSPDDSAIVIWDSPLEYKVLVYSPDGRCLFKY
QAYESGLGVKSVSWSPCGQFLAVGSYDQMLRVLSHLTWKTFAEFTHLSNVRAPC
CAAIFKEVDEPLQIDMSELSLSDDYMQGNSGDAPEGHYRVRYDVTEVPITLPCQ
KPPADRPNPKQGIGLMSWSNDSQYICTRNDSMPTILWIWDMRHLELAAILVQKD
PIRAAVWDPTGTRLVLCTGSSHLYMWTPSGAYCVSVPLSQFNITDLKWNSDGSC
LLLKDKESFCCAAAPLPPDESSDYSSDD
349 WD40 repeat MATIAALDDDMVRSMSIGAVFSDFVGKLNSLDFHRKDDILVTAGEDDSVRLYDI 126 1127
protein ANARLLKTTFHKKHGTDRVCFTHHPNSLICSSTKNLDTGESLRYISMYDNRSLR
YFKGHKQRVVSLCMSPINDSFMSGSLDHSVRMWDLRVNACQGILRLRGRPTVAY
DQQGLVFAVAMEGGAIKLFDSRSYDKGPFDAFLVGGDTSEVCDIKFSNDGKSVL
LSTTNNNIYVLDAYAGDKQCGFNLEPSPSTPIEASFSPDGQYVVSGSGDGTLHA
WNISRRNEVACWNSHIGVASCLKWAPRRAMFVAASTVLTFWIPNSEPELASAKG
EAGVPPEQV
350 WD40 repeat MSVAELKERHRAATETVNSLRERLKQKRVQLLDTDVAGYARTQGKTPVTFGATD 257 1390
protein LVCCRTLQGHTGKVYSLDWTPERNRIVSVSQDGRFIVWNALTSQKTHAIRLPCA
WVMTCAFAPNGQSVACGGLDSVCSIFNLNSPVDRDSNLPVSRMLSGHKGYVSSC
QYVPDGDAHLITGSGDQTCVLWDITTGLRTSVFGGEFQSGHTADVLSVSTNGSS
PRIFVSGSCDSTARMWDTRVASRAVHTYHGHESGVNAVKFFPDGNRFGTGSDDG
TCRLFDIRTGHELQVYYQQRGIDEIPHVTSIAFSISGRLLIAGYSNGDCFVWDT
LLAQVVLNLGSLQNSHEGRISCLGVSADGSALCTGSWDTNLKIWAFGGIRRVT
351 WD40 repeat MKKRPRGASLDQAVVDIRRREVGGLSGLSFARRLAASEGLVLRLDIYNKLKGHR 178 1632
protein GCVNTVGFNLDGDIVISGSDDRHVKLWDWQTGKVKLSFDSGHLSNVFQAKIMPY
TDDRSIVTCAADGQARHAQILEGGQVQTMLLAKHRGRAHKLAIDPGSPHIVYTC
GEDGLVQRLDLRSNTARELFTCREVYGTHVEVVHLNAIAIDPRNPNLFVIGGSD
EYARVYDIRNYKWNGSHNFGRSANYFCPSHLIGEAHVGITGLAFSGQSELLVSY
NDESIYLFTQEMGLGPDPLSASTKSVDSNSSEVTSPTAVNVDDNVTPQVYKGHR
NCETVKGVGFFGPKCEYVVSGSDCGRIFIWKKKGGQLIRVMAADKHVVNCIEPH
PHIPALASSGIENDIKIWTPKAIERATLPMNVEQLKPKARGWMNRISSPRQLLL
QLYSLERWPEHGGETSSGLAAGQEELTELFFALSANGNGSPDGGGDPSGPLL
352 WD40 repeat MSKRGYKLQEFVAHSSNVNCLSIGKKACRLFLTGGDDCKVNLWAIGKPNSLMSL 290 2917
protein CGHTNAVESVAFDSAEVLVLAGASSGVIKLWDVEEAKMVRGLTGHRSNCTAMEF
HPFGEFFASGSTDTNLKIWDIRKKGCIHTYKGHTRGISTIRFSPDGRWVVSGGN
DNVVKVWDLTAGKLLHDFKFHENHIRSIDFHPLEFLLATGSADRTVKFWDLETF
ELIGSSRPEAAGVRAIAFHPDGRTLFCGLEDSLKVYSWEPVICHDGVDMGWSTL
ADLCIHDGKLLGCSYYQSSVGVWVADASLIEPYGTNVKPQQKDSGDDEIEHQES
RPSAKVGTTIRSTSIMRCASPDYETKDIKNIYVDTASGNPVSSQRVGTTNFAKV
TQPLDFNDTPNLTLRRQGLVTETPDGLSGHVPSKSITQPKVVSRDSPDGKDSSR
RESITFSRTKPGMLLRPAHSRRPSSTKYDVDRLSACAEIGVLSSAKSGSESLVD
SFLNIKVAPEDGARNGCEDNHSSVKNVSVESEKVLPLQTPKTEKCDQTVGFKEE
INSVKFVNGVAVVPGRTRTLVEKFEKREKLNSTEDQTINTPENPTLDKTPPPSL
AENEEKSDRLNIVERKATRMSSHMVTAEDRTPVTLVGSPEDQSTVMAPQRELPA
DESSKTPPLPVEDLEIHHGSNVSEDKATILSSQTVSEEDSKRSTLIRNFRRRDR
FKSTEGRSPVMATQRKLPTDESGKTSSLPMEDLEIKGGLNVSEDKATSFSSRAP
PREDRAHSALVRNVRKRDKFKSTNDTITVMVHQRGLSTDEASTVSVERVERRQL
SNNVENPLNNLPPHSVPPTTTRGEPQYVGSESDSVNHEDVTELLLGNHEVFLST
LRSRLTKLQVV
353 WD40 repeat MSTFLTGTALSNPNPNKSYEVVQPPNDSVSSLSFNPKANFLVATSWDNQVRCWE 148 1197
protein IVRSGTSLGTTPKASISHDQPVLCSTWKDDGTTVFSGGCDKQVKMWPLSGGQPM
TVAMHDAPIKEISWIPEMNLLVTGSWDKTLRYWDTRQANPVHIQQLPERCYALT
VRHPLMVVGTADRNLIIYNLQSPQTEFKRISSPLKYQTRCLAAFPDQQGFLVGS
IEGRVGVHHLDDSQQSKNFTFKCHREGSEIYSVNSLNFHPVHHTFATAGSDGAF
NFWDKDSKQRLKAMSRCSQPIPCSTFNNDGSIFAYSACYDWSKGAENHNPATAK
TYIFLHLPQESEVKGKPRLGTTGRK
354 WD40 repeat MEVEAQQRDVNNVMCQLVDPEGTTLGPPMYLPQDVGPQQLQQMVNKLLSNEDKL 140 1567
protein PYTFYISDQELVVPLESYLQKNKVSVEKVLSIVYQPQAIFRIRPVNRCSATIAG
HSEAVLSVAFSPDGKQLASGSGDTTVRLWDLSTQTPMFTCKGHKNWVLSIAWSP
DGKHLVSGSKAGEIQCWDPLTGQPSGNPLVGHKKWITGISWEPVHLSSPCRRFV
SSSKDGDARIWDVTLRRCVICLSGHTLAVTCVKWGGDGVIYTGSQDCTIKVWET
SQGKLIRELKGHGHWVNSLALSTEYVLRTGAFDHTGKQYSSAEEMKQVALERYK
KMKGNAPERLVSGSDDFTMFLWEPSVSKHPKTRMTGHQQLVNHVYFSPDGQWVA
SASFDKSVKLWNGITGKFVAAFRGHVGPVYQISWSADSRLLLSGSKDSTLKIWD
IRTKKLKRDLPGHADEVFAVDWSPDGEKVVSGGKDKVLKLWMG
355 WD40 repeat MDAGSAHSSSNMKTQSRSPLQEQFLQRRNSRENLDRFIPNRSAMDFDYAHYMLT 376 1737
protein EGRKGKENPAVSSPSREAYRKQLAETLNMNRTRILAFKNKPPTPVELIPHELTS
AQPAKPTKTRRYIPQTSERTLDAPDLLDDYYLNLLDWGSSNVLSIALGNTVYLW
NASDGSTSELVTIDDETGPVTSVSWAPDGRHIAVGLNNSDVQLWDSADNRLLRT
LRGGHRSRVGSLAWNNHILTTGGMDGLIVNNDVRVRSHIVDTYRGHTQEVCGLK
WSASGQQLASGGNDNILHIWDRSTASSNSPTQWLHRLEEHTAAVKALAWCPFQG
NLLASGGGGGDRTIKFWNTHTGACLNSVDTGSQVCALLWNKNERELLSSHGFTQ
NQLTLWKYPSMVKIAELTGHTSRVLFMAQSPDGCTVASAAGDETLRFWNVFGVP
EVAKPAPKANPEPFAHLNRIR
356 WD40 repeat MEEAIPFKNLPSREYQGHKKKVHSVAWNCTGTKLASGSVDQTARVWHIEPHGHG 69 1010
protein KVKDIELKGHTDSVDQLCWDPKHADLIATASGDKTVRLWDARSGKCSQQAELSG
ENINITYKPDGTHVAVGNRDDELTILDVRKFKPIHKRKFNYEVNEIAWNMSGEM
FFLTTGNGTVEVLAYPSLRPVDTLMAHTAGCYCIAIDPVGRYFAVGSADSLVSL
WDISEMLCVRTFTKLEWPVRTISFNHTGDYVASASEDLFIDISNVQTGRTVHQI
PCRAAMNSVEWNPKYNLLAYAGDDKNKYQADEGVFRIFGFESA
357 WD40 repeat MGKDEEEMRGEIEERLINEEYKVWKKNTPFLYDLVITHALEWPSLTVEWLPDRE 149 1423
protein EPPGKDYSVQKLVLGTHTSENEPNYLMLAQVQLPLEDAENDARHYDDDRADVGG
FGCANGKVQIIQQINHDGEVNRARYMPQNSFIIATKTVSAEVYVFDYSKHPSKP
PLDGACSPDLRLRGHSTEGYGLSWSKFKQGHLLSGSDDAQICLWDINATPKNKS
LDAMQIFKVHEGVVEDVAWHLRHEYLFGSVGDDQYLLIWDLRTPSVTKPVQSVV
AHQSEVNCLAFNPFNEWVVATGSTDKTVKLFDLRKISTALHTFDAHKEEVFQVG
WNPKNETILASCCLGRRLMVWDLSRIDEEQTPEDAEDGPPELLFIHGGHTSKIS
DFSWNTCEDWVVASVAEDNILQIWQMAENIYHDEDDVPGEESNKGS
358 WD40 repeat MMRGFSCTEDGDAPSTSSTSPPPPPPPPHRQQMQAPRASSSSSGQPTSRRSTGN 365 2677
protein VFKLLARREVSPRSKHSLKKFWGEASECQLCPFQQSYEAVRDVRRSLISWVEAF
SLQHLSAKYCPLMPPPRSTIAAAFSPDGKILASTHGDHTVKLIDSQTGSCLKVL
RGHRRTPWVVRFHPLYPEILASGSLDHEVHLWDANTAECIGSRNFYRPIASIAF
HAQGDLLAVASGHKLYIWHYNRSGETSSPTIVLRTPRSLRAVHFHPHAAPFLLT
AEVNDLDLTDSAMTLATSPGYLHYPPPTIYLADAHSNERSRLEDELPLMPSPLL
MWPSFTRDDGRATLPHIGGDVGLSGQQRVDSLSSSQYEFHPSPIEPSSSTSMHE
EMGTDPFSSVRESEVTQSAMNIVDNTEVQPEERSTYSFSFSDPRFWELPSVYGW
LVGQTQAAPRTAPSPGALETASALGEVASVSPVRSEFMPGGMDQPRLGGRSGSG
CRSSGSRMMRTAGLNDHPHDENYPQSVVSKLRSELEASLAAAASTELPCTVKLR
VWPYDMKDPCALFRSESCRLTIPHAVLCSEMGAHFSPCGRFFAACVACVLPQLE
ADPVLHGQVDPDVTGVATSPTRHPVSAYQIMYELRIYSLEEATFGMVLASRSIR
AAHCLTSIQFSPTSEHLLLAYGRRHNSLLKSIVIDGENTVPIYSILEVYRVSDM
ELVRVLPSAEDEVNVACFHPSVGGGLVYGTKEGKLRILQIDSSGGLNPKSTGFL
DENMAEVPTYALEC
359 WD40 repeat MGEGDLPRTEAGVLRGHEGAVLAARFNGDGNYCLSCGKDRTIRLWNPHRGIHIK 24 923
protein TYKSHGREVRDVHCTSDNSKLISCGGDRQIFYWDVSTGRVIRRFRGHDSEVNAV
KFNDYASVVVSAGYDRSVRAWDCRSHSTEPIQIINTFQDSVMSVCLTKTEIIGG
SVDGTVRTFDIRIGREISDDLGQPVNCISMSNDGNCILASCLDSTLRLVDRSAG
ELLQEYKGHTCKSYKLDCCLTNTDAHVAGGSEDGYVFFWDLVDASVISKFRAHS
SVVTSVSYHPKEDCMITASVDGTIKVWKT
360 WD40 repeat MACIKGVGRSASVAMAPDGGYLATGTMAGTVDLSFSSSASLEIFGLDFQSDDRD 221 3598
protein LPLIAESPSSERFNRLSWGKNGSGSDEFSLGLIAGGLVDGTIGLWNPLSLIRSE
AGDKAIVGHLSRHKGPVRGLEFNVIAPNLLASGADDGEICIWDLAAPREPSHFP
PLRGSGSAAQGEISFLSWNSKVQHILASTSYNGTTVVWDLKKQKPVISFSDSVR
RRCSVLQWNPDLATQLVVASDEDSSPTLRLWDMRNIMSPVKEFAGHTRGVIAMS
WCPNDSSYLVTCAKDNRTICWDTVTGEIVCELPAGSNWNFDVHWYPKIPGVISA
SSFDGKIGIYNVEGCSRYGVRENEFGAATLRAPKWFKRPVGASFGFGGKVVSFH
TRSTGGPSVNSSEVFVHDIITEQTLVSRSSEFEAAIQSGDRPSLRALCEKKSQH
CESTDDQETWGFLKVLLEDDGTARSKLLAHLGFDIPTETNDGSQEDLSQQVNAL
GLEDVTADKVVQEDNNESMVFPTDNGEDFFNNLPSPRADTPVSTSADGFPTVNA
AVEPSQDEVDGLEESSDPSFDDSVQRALVVGDYKAAVALCMSANKLADALVIAH
VGGASLWESTRDKYLKMSRLPYLKVVFAMVNNDLQSLVDTRPLKFWKETLAILC
SFAQGEEWAMLCNSLASKLMAAGNMLAATLCFICAGNIDKTVEIWSRSLATEHD
GMSYMDLLQDLMEKTIVLALASGQKQFSASVCKLVEKYAEILASQGLLTTAMDY
LKLLGTDDLSPELAVLRDRIAFSVEAEKGANISAFNGSQDPRGAVYGVDQSNYG
MVDTSQHYYPEAAQPQVPHTVPGSPYGENYQQPFGSSFGKGYNTPMQYQAPSQA
SMFVPSEPPQNAQPSFVPTPVTSQPTTRSQFIPAPPLALRNPEQYQQPTLGSHL
YPGSVNPTFQPLPHAPGPVAPVPPQVSSVPGQNMPQAVAPTQMRGFMPVTNPGV
VQNPGPISMQPATPIESAAAQPVVSPAAPPPTVQTADTSNVPAPQKPVIATLTR
LYNETSEALGGSRANPAKKREIEDNSRKIGALFAKLNSGDISKNAADKLVQLCQ
ALDNGDYSTALQIQVLLTTSEWDECNFWLATLKRMIKTRQNVRLS
361 WD40 repeat MKERGKGAGRSVDERYTQWKSLVPVLYDWLANHNLVWPSLSCRWGPQLEQATYK 44 1447
protein NRQRLYLSEQTDGSVPNTLVIANVEVVKPRVAAAEHISQFNEEARSPFVKKFKT
IIHPGEVNRIRELPQNSKIVATHTDSPDVLIWDVETQPNRHAVLGASTSRPDLI
LTGHKDNAEFALAMSPTEPFVLSGGKDRYVVLWSIQDHISTLAADPGSAKSPGS
AGTNNKQSSKAAGGNDKTGDSPSIEPRGVYLGHGDTVEDVTFCPSSAQEFCSVG
DDSCLILWDARTGSSPAIKVEKAHHADLHCVDWNPHDVNLILTGSADNTVRMFD
RRNLTSGGVGSPVHTFEGHNAAVLCVQWSPDKSSVFGSSAEDGILNIWDHEKIG
RKIETVGSKVPNSPPGLFFRHAGHRDKVVDFHWNSSDPWTIVSVSDDGESTGGG
GTLQIWRMIDLIYRPEEEVLAELDKFKSHILSCTS
362 WD40 repeat MAKIAPGCEPVAGTLTPSKKREYRVTNRLQEGKRPLYAVVFNFIDSRYFNVFAT 196 1314
protein VGGNRVTVYQCLEGGVIAVLQSYIDEDKDESFYTVSWACNIDRTPFVVAGGING
IIRVIDAGNEKIHRSFVGHGDSINEIRTQPLNPSLIVSASKDESVRLWNVHTGI
CILIFAGAGGHRNEVLSVDFHPSDKYRIASCGMDNTVKIWSMKEFWTYVEKSFT
WTDLPSKFPTKYVQFPVFIAPVHSNYVDCNRWLGDFVLSKSVDNEIVLWEPKMK
EQSPGEGSVDILQKYPVPECDIWFIKFSCDFHYHSIAIGNREGKIYVWELQSSP
PVLIAKLSHPQSKSPIRQTAMSFDGSTILSCCEDGTIWRWDAITASTS
363 WD40 repeat MNTAMHFGAGWRSIAEMGYTMSRLEIEPESCEDEKSLDGVGNSQGPNELPRCLD 193 1668
protein HELAHLTNLKSRPHEHLIRDFPGRRALPVSTVKMLAGRECNYSRRGRFSSADCC
HMLSRYVPVNGPSPLDQMNSRAYVSQFSADGSLFVAGFQGSHIRIYNVDKGWKC
QKNILTKSLRWTITDTSLSPDQRYLVYASMSPIVHIVDIGSAAMDSLANITEIH
EGLDFSADSGPYSFGIFSVKFSTDGREVVAGSSDDSIYVYDLVANKLSLRIPAH
ESDVNTVCFADESGHIIYSGSDDTYCKVWDRRCLSARNKPAGVLMGHLEGITFI
DSRGDGRYFISNGKDQTIKLWDIRKMGSDICRRGFRNFEWDYRWMDYPPRARDS
KHPFDLSVATYKGHSVLRTLIRCYFSPVHSTGQKYIYTGSHDSCVYIYDVVTGA
QVAALKHHKSPVRDCSWHPEYPMIVSSSWDGDIVKWEFFGNGETEIPAMIKKRIR
RRHLY
364 WD40 repeat MEPQPQAPKKRGRKPKPKEDKKEEQLHQPPPPPPPQQQAAPAPAPAATRSSTSG 78 1634
protein SAGGRDRRPQQQHAVDEKYARWKSLVPVLYDWLANHNLLWPSLSCRWGPQLEQA
TYKNRQRLYISEQTDGSVPNTLVIANCEVVKPRVAAAEHVSQFNEEARSPFIRK
YKTIIHPGEVNRVRELPQNPNIVATHTDSPDVLIWDVESQPNRHAVYGATASRP
NLILTGHQENAEFALAMCPAEPFVLSGGKDKTVVLWSIQDHITASATDQTTNKS
PGSGGSIIKKTGEGNEETGNGPSVGPRGIYCGHEDTVEDVAFCPSTAQEFCSVG
DDSCLILWDARVGTNPVAKVEKAHNGDLHCVDWNPHDNNLILTGSADNSVNMFD
RRNLTSNGVGSPVYKFEGHKAAVLCVQWSPDKPSVFGSSAEDGLLNIWDYERVD
KKVDRAPNAPAGLFFQHAGHRDKIVDFHWNAADPWTMVSVSDDCDTAGGGGTLQ
IWRMSDLIYRPEEEVLAELENFKAHVLECSKA
365 WD40 repeat MGIFEPYRAVGYITTGVPFSVQRLGTETFVTVSVGKAFQVYNCAKLSLVLVGPQ 85 2826
protein LPKKIRALASYREYTFAAYGSDIGIFKRAHQLATWSGHTAKVCLLLLFGEHILS
VDVDGNAYIWAFKGMNYNLSPVGHILLDSNFTPSCIMHPDTYLNKVILGSQEGP
LQLWNISTKTKLYEFKGWNSSVSSCVSSPALDVVAVGCADGKIHVHNIRYDEEL
VTFSHSMRGSVTALSFSTDGQPLLASGSSSGVVSIWNLDKRRLQSVIRDAHDGS
IISLHFFANEPVLMSSSADNSIKMWIFDTSDGDPRLLRFRSGHSAPPLCIRFYA
NGRHILSAGQDRAFRLFSVVQDQQSRELSQRHVSKRAKKLKLKEEEIKLKPVIA
FDVAEIRERDWCNVVTSHMDTPQAYVWRLQNFVIGEHILRPCPNKPTPVKACMI
SACGNFAILGTAGGWIERFNLQSGISRGSYIDQLEGTNSAHDGEVVGVACDATN
TLMISAGYAGDIKVWDFKGRELKSRWEIGSSLVKISYHRLNGLLATVADDFIIR
LFDAVALRMVRKFEGHTDRITDLCFSEDGKWLLSSSMDGSLRIWDIILARQVDA
VFVDVSITALSLSPNMDILATTHVDQNGVFLWVNQSMFSGDSDINLYASGKEVV
TVKLPSVSSVEGSQVEESNEPTIRHSESKDVPSFRPSLEQIPDLVTLSLLPKSQ
WQSLINLDIIKVRNKPVEPPKKPEKAPFFLPSIPSLSGEILFKPSEMSDKGDMK
ADEDKSKITPEVPSSRFLQLLHSCSEAKNFSPFTTYIKGLSPSTLDLELRMLQI
IDDDAVDADADDPQDVDKRQELLSIELLMDYFIHEISCRSNFEFVQALVRLFLK
IHGETIRRQSVLQNKAKVLLETQCSVWQRVDKLFQGARCMVAFLSNSQF
366 WD40 repeat MEETKVTCGSWIRRPENVNLAVLGRSPRRRGSAALEIFAFDPKSTSLSSSPLVA 74 1246
protein HVIEEIEGDPLAIAVHPNGEDIVCFASSGSCLSFELSGQESNLKLLTKELPPLR
GIGPQKCMAFSVDGSRFATGGVDGRLRILEWPSLRIILDEPKAHKSIRDLDFSL
DSEFLATTSTDGSARIWKAEDGLPCTTLTRRSDEKIELCRFSKDGTKPFLFCTV
QRGDKAVTGVWDISTWNKIGHKRLLRKPAVVMSISLDGKYLAQGSKDGDMCVVE
VKKMEVSHWSKRLHLGTSLTSLEFCPIERVVITTSDEWGVLVTKLNVPADWKAW
QVYLLLLGLFLASLVAFYIFYENSDSFWGFPLGKDQPARPKIGSVLGDPKSADD
QNMWGEFGPLDM
367 WD40 repeat MADPVEHQHQQHQQHQLQQQRRRGWRIQGGQYLGEISALCFLHLPPPPLSLSSS 100 4377
protein PVLSLSSGLDSESRDRPACSFRFPSAGSGSQVSLFDLASGAMVRTFYVFRGIRV
HGIVLGCADFPGGSSSSSSTLDYVIAVYGERRVKLFRLSVRLGRGAGEGSGTVL
SADLELVSAAPRLSHWVMDVRFLKENGTSEDELQRCLTVAIGCSDNSIRLWDVD
KCSFVLAVSSPERCLLYSMRLWGDNLEDLQVASGTIYNEILIWKVVPNHDAPSS
NELTEEGLTNSCAGNSVHECLRYEAYHICRLVGHEGSIFRIAWSSDGSKLVSVS
DDRSARIWEVHCKVQYSEDAGEVGLLFGHSARVWDCYISDNLIVTAGEDCSCRV
WGLDGQQHDVIKEHIGRGIWRCLYDPWSSLLVTGGFDSAIKVHKLDASLAEASA
KQSNIKDLSDGTELFTTHLPNSSGHSGHMDSKSEYVRCLSFSCEDVMYIATNHG
YLYHAKLCNDGDLRWTELAQVSNEVQIICMELLPSNPYDPRIDADDWVAVGDGK
GWTTVVRVVKNSDSPKVSTSFSWAAEMDRQLLGIHWCKSLGHRFIFTADPRGAL
KLWRFFEVSQSSSLYPENSPRISLIAEFKSDLGARIMCLDVAFESELLICGDLR
GNLVLFPLLKDLLLDTFVVSAAKISPVNHFKGAHGISAVSSISVAHMSFNHIEL
RSTGADGCICYMEYDKGLQSLNFVGMKQVKELSMIESVSTENESTGYRTSGSYA
SGFASTDFIIWNLVTEAKVLQVSCGGWRRPHSYYLGDVPEMKNCFAYVKDDIIY
IRRHWIKDSKDKILPQNLRLQFHGREVHSLCFVTGDFQLRKNKQSSWIVTGCED
GTVRLTRYTQCTDNWSSSKLLGEHVGGSAVRSICCVSNIHTTSSGTSVSDVKGI
ENLPKDIKGTLMEDECNPSLLISVGAKRVLTSWLLRRRKQDGKEDDVTDLQEAE
NSSLPSSAGSSTFSFQWLSTDMPVKYSVPSKKSGSIKKLIGVSDTNVRCKSLLP
DSEALQSKVSAVDKNEDDWRYLAVTAFLVRHSGSRLIVCFIIVACSDATLAIRA
LVLPYRLWFDVALMVPLSSPVLSLQHVIIGRCQLPDENVQIGNVYVVISGATDG
SIAFWDLTESVEAFMRRLSNIHLEKFMDCQKRPRTGRGSQGGRWWRSLSKIACK
EQPINDPVTAKAIKELNRKLTGGVACGSSSSMLDASPELDSNAANSSFEIIEVN
PFHVLNGVHQSGVNCLHVCETKHGQSSDGRFLYQLVSGGDDQALHLLKFEVLVQ
PPVQVPDVPNSDIRNSILVEEFLLDEQNQKTKCTIEFISQEKIASAHNSAVKGV
WTDGTWVFSTGLDQRVRCWISKDRGTPTELAHFIISVPEPEALDARSICWDQYQ
IAVAGRGMQMIEFHVPSSEIR
368 WD40 repeat MPYKLSATLSNHSSDVRAVASPSDDLILSASRDSTAISWFRQSPSSFTPASVIR 58 2439
protein AGSRFVNAIAYLPPTPRAPQGYAVVGGQDTVVNVFALGPGDKEEPEYTLVGHTD
NVCALSVNSDDTIISGSWDKTAKVWKDFALVYDLKGHQQSVWAVLAMNEKEFLT
ASADRTIKYWVQHKTMQTYEGHRDAVRGLALIPDIGFASCSNDSEIRVWTMGGD
VVYTLSGHTSFVYSLSVLPNGDLVSAGEDRSVRVWRDGECSQVIVHPAISVWAV
STMPNGDIISGSSDGVVRVFSESEKRWATASELKALEDQIASQSLPSQQVGDVK
KTDLPGPEALSVPGKKAGEVKMIRSGDVVEAHQWDSLASSWQKIGEVVDAIGSG
RKQLHDGKEYDYVFDVDIQEGAPPLKLPYNVSENPYTAAQRFLEQNDLPTGYLD
QVVKFIEQNTAGVKLGNDGYVDPFTGASRYQPATQSTSNTASSSYMDPFTGGSR
HIAESAPSNVPQGSHATGIIPFSKPIFFKLANVSAMQAKMFQFDEVLRNEISTA
TLAMRPDEVIMVNETFTYLSKVVTSTSSARTSLGWIHIETIMQILDRWPVPQRF
PVIDLGRLVTAYCMNAFSGPGDLEKFFSCLFRTSEWTSITSGSKALTKAQETNV
LLLFRTIANSLDGAPLNDMEWIKQIFRELAQTPQLVLNKSHRLALASVLFNFSC
IGLKGPVPADVRTLHLTIILQVLRSPNDDPEVAYRTCVALGNMLYSDKTRGTPR
DAQSPSPTELKSAVAAIKGGFSDPRINDVHREIMSLI
369 WD40 repeat MPPQKIESGHKDTVHDLAMDYYGKRLATASSDHTINVVGVSSSGSQHLATLIGH 159 1064
protein QGPVWQISWAHPKFGSLLASCSYDGRVIIWREGNPNEWTQAQVFEEHKSSVNSV
AWAPHELGLCLACGSSDGNISVFTARQDGGWDTSRIDQAHPVGVTSVSWAPSTA
PGALVGSGMMEPVQKLCSGGCDNTVKVWKLYNRVWKLDCFPVLQMHTDWVRDVA
WAPNLGLPKSTIASASQDGRVIIWTLAKEGDQWQGKVLYDFRTPVWRVSWSLTG
NILAVADGNNNVSLWNEAVDGEWIQVSTVEP
370 WD40 repeat MSAPMLEIEARDVVKIVLQFCKENSLHQTFQTLQSECQVSLNTVDSIETFVADI 118 1665
protein NSGRWDAILPQVAQLKLPRNTLEDLYEQIVLEMIELRELDTARAILRQTQAMGV
MKQEQPERYLRLEHLLVRTYFDPNEAYQDSTKEKRRAQIAQALAAEVTVVPPSR
LMALVGQALKWQQHQGLLPPGTQFDLFRGTAAMKQDVDDMYPTTLSHTIKFGTK
SHAECARFSPDGQFLVSCSVDGFIEVWDYMSGKLKKDLQYQADETFMMHDDPVL
CVDFSRDSEMLASGSQDGKIKVWRIRTGQCLRRLERAHSQGVTSVLFSRDGSQL
LSTSFDGSARIHGLKSGKQLKEFRGHSSYVNDAIFSNDGSRVITASSDCTVKVW
DVKTSDCLQTFKPPPPLRGGDASVNSVHLFPKNADHIVVCNKTSSIYIMTLQGQ
VVKSLSSGKREGGDFVAACVSPKGEWIYCVGEDRNLYCFSCQSGKLEHLMKVHE
KDVIGVTHHPHRNLVATYSEDSTMKLWKP
371 WD40 repeat MDLLQSYAEDNDGDLGRHSSPEPSPPRLLPSKSAAPKVDDTTLALTVAQTNQTL 57 16828
protein ARPIDPSQHAVAFNPTYDQLWAPICGPAHPYAKDGIAQGMRNHKLGFVEDAAIG
SFLFDEQYNTFQRYGYAADPCASTGNEYVGDLDALKQNDGISVYNIRQQEQKKY
AEEYAKKKGEERGEGGREKAEVVSDKSTFHGKEERDYQGRSWIAPPKDAKATND
HCYIPKRLVHTWSGHTKGVSAIRFFPKHGHLILSAGMDTKVKIWDVFNSGKCMR
TYMGHSKAVRDISFCNDGTKFLTAGYDKNIKYWDTETGKVISTFSTGKIPYVVK
LHPDDEKQNILLAGMSDKKIVQWDMNTGQITQEYDQHLGAVNTITFVDDNRRFV
TSSDDKSLRVWEFGIPVVIKYISEPHMHSMPSISLHPNTNWLAAQSLDNQILIY
STRERFQLNKKKRFAGHIVAGYACQVNFSPDGRFVMSGDGEGRCWFWDWKSCKV
FRTLKCHEGVCIGCEWHPLEQSKVATCGWDGLIKYWD
372 WD40 repeat MESNGNLEQTLQDGRIYRQLNSLIVAHLRDHNFPQAASAVALATMTPLNVEAPR 250 1566
protein NRLLELVAKGLAVEKGELLRGVSHAGTNDLGGSIPASYGLVPAPWTAIDFSSLR
DTKGMSKSFTKHETRHLSDHKNVARCARFSTDGRFFATGSADTSIKLFEVSKIK
QMMLPDSTDGAIRAVIRTFYDHTHPVNDLDFHPQNTVLISAAKDHTVKFFDYSK
ATAKRAFRVIQDTHNVRSVAFHPSGDFLLAGTDHPIPHLYDVNTFQCYLSANVP
EFAVNAAINQVRYSSSGGMYVTASKDGTIRFWDGASANCVRSIAGAHGAAEVTS
ANFTKDQRYVLSCGKDSTVKLWEVGTGRLVKQYLGATHMQLRCQAVFNNTEEFV
LSIDEPSNEIVVWDAMTAEKVARWPSNHNGPPRWIEHSPTEAAFVSCGTDRSIR
FWKETH
373 WD40 repeat MSNFQGEDGEYVADDFEAEDGDEELHGRESADPESDVDEIDTPSNRFTDTTADQ 106 1434
protein ARRGRDIQGIPWERLSITREKYRRTRLEQYKNYENVPQSGEKSGKDCTVTEKGN
SFYEFRRNSRSVKSTILHFQLRNLVWATSKHDVYLMSNYSVVHWSSLTGKKSEV
LNLAGHVAPNEKHPGSLLEGFTQTQVSTLAVKDRFLVAGGFQGELICKFLDRPG
ISFCSRTTYDDNAITNAVEIYVSPSGGIHFIASNNDCGVRDFDMENFELSKHFR
FPWPVNHTSLSPDGKLLVIVGDDPEGILVDAKTGKTIMPLRGHLDFSFASEWHP
DGVTFATGNQDKTCRIWDIRNLSKSIAVLKGNLGAIRSIRYTSDGRYMAIAEPA
DFVHVYDTKTGYKKEQEIDFFGEISGMSFSPDTESLFIGVWDRTYGSLLEYGRR
RNFSYLDCLV
374 WD40 repeat MGVEEDLEDLNALAESTDAAVDGQAALASAVDSVTLQPAPPILPPVIPPPAVPV 190 1917
protein VAPVPTIPPVLRPLAPLPIRPPVLRPPAPKRDEAGSSDSDSDHDGTAAGSTAEY
EITEESRLVRERHEKAMQDLMMKRRGAALAVPTNDKAVRARLRRLGEPMTLFGE
REMERRDRLRMLMAKLDAEGQLEKLMKAHEDEEAAASAAPEDVEEEMLQYPFYT
EGSKALFNARIDIAKFSITRAALRLERARRRRDDPDEDVDAEIDWALKKAESLS
LHCSEIGDDRPLSGCSFSHDGKLLATCSMSGVAKLWDTCRMPQVNRVLTLKGHT
ERATDVAFSPVQNHIATASADRTAKLWNTEGTILKTFEGHLDRLGRIAFHPSGK
YLGTTSFDKTWRLWDIESGEELLLQEGHSRSIYGIDFHRDGSLVASCGLDALAR
VWDLRTGRSILALEGHVKPVLGVSFSPNGYHLATGGEDNTCRIWDLRKKKSLYT
IPAHANLISEVKFEPQEGYFLVTASYDTTAKVWSARDFKPVKTLSVHEAKITSV
DITADASHIVTVSHDRTIKLWTSNDDVKEQAMDVD
375 WD40 repeat MVKAYLRYEPAAAFGVIASVESNIAYDASGKHLLAPALEKVGVWHVRQGVCTKA 102 2942
protein LAPSASSAAGPSLAVTAIASSPSSLIASGYADGSIRIWDFEKGSCETTLNGHKG
AVSVLRYGKLGSLLASGSKDNDIILWDVVGETGLYRLRGHRDQVTDLVFLDSDK
KLVSSSKDKYLRVWDLETQHCMQIVGGHHSEIWSLDTDPEERYLVTGSADPELR
FYTVKNDSSDERSEADASGGVGNGDLASHNKWDVLKQFGEIQRQSKDRVATVRF
NKNGNLLACQAAGKLVEVFRVLDEAEAKRKAKRRLHRKREKKGADVNENGDSSR
GIGEGHDTMVTVADVFKLLQTIRASKKICSISFCPVAPKSSLATLALSLNNNLL
EFHSIEADKTSKMLTIELQGHRSDVRSVTLSSDNTLLMSTSHNSVKIWNPSTGS
CLRTIDSGYGLCGLIVPQNKHALIGTKDGAIEIFDVGSGTCIEVVEAHGGSIRS
IVAIPNQNGFVTGSADHDIKFWEYGMKQKPGDNSKHLTVSNVRTLKMNDDVLVV
AVSPDAQKIAVALLDCTVKVFFMDSLKLMHSLYGHRLPVLCLDISSDGDLIVTG
SADKNLMIWGLDFGDRHKSIFAHGDSIMAVQFVGNTHYMFSVGKDRLVKYWDAD
KFELLLTLEGHHADIWCLAISNRGDFLVTGSHDRSIRRWDRTEEPFFIEEEKEK
RLEEMFESDLDNAFGNKYVPKEEIPEEGAVALAGKKTQETLSATDSIIEALDIA
EVELKRIAEHEEEKNNGKTAEFHPNYVMLGLSPSDFILRALSNVQTNDLEQTLL
ALPFSDALKLLSYLKDWTTYPDKVELVSRIATVLLQTHYNQLVSTPAARPLLTT
LKDILHKKVKECKDTIGFNLAAMDHLKQLMALRSDALFQDAKVKLLEIRSQLSK
RLEERTDPREAKRRKKKQKKSTNMHAWP
376 WD40 repeat MGGVQAEREDKDKVSLELTEEILQSMEVGMTFRDYSGRISSMDFHRASSYLVTA 75 1079
protein SDDESIRLYDVASATCLKTINSKKYGVDLVSFTSHPMTVIYSSKNGWDESLRLL
SLHDNKYLRYFKGHHDRVVSLSLCPRNECFISGSLDRTVLLWDQRAEKCQGLLR
VQGRPATAYDDPGLVFAIAFGGCVRMFDARKYEKGPFEIFSVGGDVSDANVVKF
SNDGRLMLLTTTDGHIHVLDSFRGTLLYTFNVKPTSSKSTLEASFSPEGMFVIS
GSGDGSVYAWSVRGGKEVASWLSTDTEPPVIKWAPGNLMFATGSSELSFWIPDL
SKLGAYVGRK
377 WD40 repeat MAAFGAAPAGNHNPNKSSEVIQPPSDSVSSLCFSPRANHLVATSWDNQVRCWEL 99 1148
protein TKNGASVTSVPKASMSHDQPVLCSAWKDDGTTVFSGGCDKQAKMWSLMSGGQPV
TVAMHDAPIKEIAWIPEMNVLVTGSWDKTLKYWDTRQSNPVHTQQLPERCYAMT
VRYPLMVVGTADRNLIVFNLQNPQAEFKRFSSPLKYQTRCVAAFPDQQGFLVGS
IEGRVGVHHLDDSQISKNFTFKCHRDNNDIYSVNSLNFHPVHHTFATAGSDGTF
NFWDKDSKQRLKAMSRCSQPIPCSTFNNDGTIYAYSVCYDWSKGAENHNPATAK
TYIFLHLPQESEVKAKPRVGTTNRK
378 WD40 repeat MNCSISGEVPEEPVVSTKSGHVFERRLIERYVSDYGKCPVSGEPLTMDDVLPVK 232 1806
protein MGKIVKPRPLQAASIPGLLSIFQNEWDSLMLSNFALEQQLHTARQELSHALYQH
DAACRVIARLKKERDEARSLLALAERQIPMTASSDIAVNAPAMSNGRKASLDEE
PGYAGKKMRPGISASIIAEITDCNLALSQQRKKRQIPSTLAPVEDLERYTQLSS
YPLHKTGKPGITSLDICHSKDIIATGGIDTSAVLFDRSSGQIMSTLSGHSKKVT
SVNFDAQGDMVLTGSADKTVRIWQGSEDGSYNCRHILKDHTAEVQAITVHATNN
YFATASLDNTWCFYEFSTGLCLTQVEGASGSEGYTSAAFHPDGLILGTGTSNAD
VKIWDVKTQANVTTFSGHTGAITAISFSENGYFLATAAQDGVKLWDLRKLKNFR
TFSAYDKDTGTNSVEFDHSGCYLGLAGSDIRVYQVASVKSEWNCVKTFPDLSGT
GKVTCVKFGPDSKYIAVGSMDHNLRIFGLPSEDGAMES
379 WD40 repeat MAAPGVETLKKEIKELKEKIAQHRLDTDGEQPLPAAAKSKSVPEVSAALKQRRI 72 1124
protein LKGHFGKIYALHWSADSRHLVSASQDGKLIIWNGFTTNKVHAIPLRSSWVMTCA
YSPSGNLVACGGLDNLCSVYKVPHGGNKESSSAQKTYGELAQHEGYLSCCRFIK
DNEIVTSSGDSTCILWDVETKTPKAIFNDHTGDVMSLAVFDDKGVFVSGSCDAT
AKLWDHRVHKQCVMTFQGHESDINSVQFFPDGDAFGTGSDDSSCRLFDIRAYQQ
INKYSSDKILCGITSVAFSKTGKSLFAGYDDYNTYVWDTLSGNQVEVLTGHENR
VSCLGVSEDGKALATGSWDTLLKIWA
380 WD40 repeat MGGVEDESEPASKRMKLSSRVLRGLANGSSRTEPAAGSSLDLMARPLPIEGDEE 315 2069
protein VIGSKGVIKRVEFVRLIAKALYSLGYEKSGARLEEESGIPLQSSVVNLFMQQIS
DGLWDESVVTLHKIGLSDENLVKSASFLILEQKFLELLDQEKAMDALKTLRTEI
TPLCIKNSRVRELSSCIISPSSCGLLNQNKRNSTRARSRSELLEELQKLLPPAV
IIPERRLEHLVEQALVLQTDACMLHNSIDMEMSLYTDHQCGKEHIPCRTLQILQ
SHNDEVWLVQFSHNGKYLASASNDRSAIIWEVDENGSVSLKHKLTGHQKPISSV
CWSPDDRQLLTCGVGETVRRWDVSSGECLRVYEKAGHGLISCAWFPDGKWICYG
VSDRSICMCDLEGKEIECWKGQRTLSISDLEITSDGKQIISICRETAILLLDRE
AKYERMIEENQTITSFSLSKDNRYLLVNLLNQEIHLWDIKGDFRLVAKYKGLKR
SRFVIRSCFGGLKQAFVASGSEDSQVYIWHKGSGELIEPLPGHSGAVNCVSWNP
ANHHMLASASDDRTIRIWGLNELNTRHKGARPNGVHYCNGNGTS
381 WD40 repeat MTQLAETYACMPSTERGRGILIAGNPKPGSNSVLYTNGRSVVILNLDNPLDISV 145 1968
protein YAEHAYPATVARFSPNGEWVASADSSGAVRIWGAYNDHVLKKEFKVLSGRIDDL
QWSPDGLRIVASGDGKGKSLVRAFMWDSGTNVGEFDGHSRRVLSCAFKPTRPFR
IVTCGEDFLVNFYEGPPFKFKLSRRDHSNFVNCLRFSPDGNRFISVSSDKKGII
YDGKTGEKIGELSSDGGHTGSIYAVSWSPDSKQVITVSADKSAKIWDISEDGSG
NLRKTLTSSGSGGVDDMLVGCLWQNNHLVTVSLGGTISIYTAGDLDKAPVSFSG
HMKNVSSLSVLKGDPKVILSSSYDGLIIKWIQGIGFSGRVQRKESTQIKCLAAV
DEEIVTSGYDNKVCRVSGSGDAEFIDIGCQPKDLSLALQCPEFALVSTDTGVVL
LRGAKIVSTINLGFAVTASTVAPDGTEAIIGAQDGKLRIYSISGDTLTEEAVLE
KHRGAISVIHYSPDLSMFASGDLNREAVVWDRASREVRLKNILYHTARINCLAW
SPDSSTVATGSLDTCVIIYEVDKPASNRLTIKGAHLGGVYGLAFTDDFSVVSSG
EDACIRVWKINRQ
382 WD40 repeat MKVKVISRSTDEFTRERSQDLQRVFRNFDPNLRTQEKAVEYVRALNAAKLDKVF 130 1488
protein ARPFVGAMDGHVDSVSCMAKNPNYLKGIFSGSMDGDIRLWDIASRRTVCQFPGH
QGPVRGLAASTDGQILVSCGIDSTVRLWNVPVATLGESDGTHENLAKPLAVYVW
KNAFWAVDHQWDGELFATAGAQVDIWNQNRSQPISSFEWGTDTVISVRFNPGEP
NVLATSGSDRSITLYDLRMSSPTRKVIMRTKTNAISWNPMEPMNFTAANEDCNC
YSYDARKLEEAKCVHKDHVSAVMDIDYSPTGREFVTGSYDRTVRIFQYNGGHSR
EVYHTKRMQRVFCVKFSCDASYVISGSDDTNLRLWKAKASEQLGVVLPRERRKH
EYHEAVKSRYKHLPEVKRIVRHRHLPKPIYKAGILRRTVNEADRRKEERRKAHS
APGSSSAEPLRKRRIIKEIE
383 WD40 repeat MVRSIKNPKKAKRKNKGSKNGDGSSSSSSIPSMPTKVWQPGVDKLEEGEELQCD 269 1693
protein PSAYNSLHAFHIGWPCLSFDIVRDTLGLVRTEFPHQVYFVAGTQAEKPTWNSIG
IFKVSNITGKRRELVPSKPTDDADEESDSSDSDEDSDDEVGGSGTPILQLRKVG
HEGCVNRIRAMNQNPHICASWGDSGHVQIWDFSSHLNALAESEADVSQGASSVF
NQAPLVKFGGHKDEGYALDWSPLVPGRLVSGDCKNSIHLWEPTSGSTWNVDSTP
FIGHAASVEDLQWSPTEENVFASCSVDGTIAIWDTRLGKTPAASFKAHDADVNV
ISWNRLATCMLASGCDDGTFSIHDLRLLKEGDSVVAHFEYHKHPVTSIEWSPHE
ASTLAVSSADCQLTIWDLSLEKDEEEEAEFKAKTKEQVNAPEDLPPQLLFVHQG
QKDLKELHWHAQIPGMIVSTAADGFNILMPSNIQSTLPSDGA
384 CDK type A MERYKVIKELGDGTYGSVWKALNQQTHEIVAIKKMKRKYYIWEECINLREVKSL 1163 2545
RKLNHPNIIKLKEVIRENNELFFIFEYMECNLYQIMKERSTPFSETAIIKFCYQ
ILQGLSYMHRNGYFHRDLKPENLLVTSDLIKIADFGLAREVLTSPPYTDYVSTR
WYRAPEVLLQSPTYTTAIDMWAVGAILAELFTLHPLFPGESELDEIYKICGVLG
TPDYETWPDGMQLAAFRNFIFPQFLPVNLSVLIPHASPEAIDLITRLCSWDPQK
RPTAEQALHHPFFRIGMSIPLSLGGHFQDNTCAAEVDTNFHSKKACKGRGMGEK
ESSLECFLGLSLGLKPSLGHLGAMGSQGVGAVKQEVGSSPGCQSNPKQSLFQVL
NSRAILPLFSSSPNLNVVPVKSSLPSAYTVNSQVMWPTIAGPPAAAVTVSTLQP
SILGDFKIFGKSMGLASQYAGKEASPFS
385 CDK type A MGEMGRGINNSSNNNNSNRPAWLQHYDLVGKIGEGTYGLVFLARSKLPNNRGLR 152 1582
IAIKKFKQSKDGDGVSPTAIREIMLLREFSHENVVKLVNVHINHVDMSLYLAFD
YAEHDLYEIIRHHREKLNHHNINQYTVKSLLWQLLNGLNYLHSNWIVHRDLKPS
NILVMGEGEEHGVVKIADFGLARIYQAPLKPLSDNGVVVTIWYRAPELLLGAKH
YTSAVDMWAVGCIFAELITLKPLFQGVEVKASPNPFQLDQLDKIFKVLGHPTIE
KWPTLMNLPHWSKNLQQIQQHKYDNAGLHIGPIPAKSPAYDLLSKMLEYDPRKR
ITAAQALEHEYFRIDPQPGRNALVPSQPGEKAINYPPRLVDANTDFDGTIAPQP
SQVSSGNAPSGSIASAAVPAVRPLPQQMQLMGMQRMQNPGMAAFNLGAQASMSG
LNHNNIALQRGSSQQQAHQQVRRKEPNSGFPNTGYPPPPKSRRL
386 CDK type B-1 MDKYEKLEKVGEGTYGKVYKARDKMTGQLVALKKTRLEMDEEGVPPSSLREISL 389 1297
LQMLSQSIYVVRLLCVEHVTKKGKPLLYLVFEYLDTDLKKFIDYRRSVNAGPLP
QNVIQSFMYQLLKGVAHCHSHGVLHRDLKPQNLLVDKSKGLLKVGDLGLGRAFT
VPLKCYTHEVVTLWYRAPEVLLGSTHYSTPVDIWSVGCIFAEMVRRQPLFPGDC
EIQQLLHIFTLLGTPTEEMWPGVKRLRDWHEYPQWKPENLARAVPNLSPTGLDL
ISKMLQCDPAKRISAKAAMNHPYFDDLDKSQF
387 CDK type B-1 MDGYEKMDKVGEGTYGKVYMARDKKTGQLVALKKTRLENDGEGIPPTALREISL 38 946
LQMLSQDIYIVRLLDVKHTENKLGKPLLYLVFEYMESDLKKYIDSYRRSHTKMP
PSMIKSFMYQLCRGVAYCHSRGVMHRDLKPHNLLVDKEKGVLKIADLGLSRAFT
VPVKKYTHEIVTLWYRAPEVLLGATHYSLPVDIWSVGCIFAEMSRMQALFTGDS
EVQQLMNIFRFLGTPNEEVWPGVTKLKDWHIYPEWKPQDISHAVPDLEPSGLDL
LSQMLVYEPSKRISAKKALEHPYFDDLDKSQF
388 CDK type B-1 MDAYEKLEKVGEGTYGKVYKAKDKNTGQLVALKKTRLESDDEGIPPTALREISL 180 1088
LQMLSQDIHIVRLLDVEHTENKNGKPLLYLVFEYMDSDLKKYIDGYRRSHTKVP
PNIIKSFMYQLCQGVAYCHSRGVMHRDLKPHNLLVDKQRGVVKIADLGLGRAFT
IPIKKYTHEIVTLWYRAPEVLLGATHYSTPVDIWSVGCIFAEMVRLQALFIGDS
EVQQLFKIFSFLGTPNEEIWPGVTKFRDWHIYPQWKPQDISSAVPDLEPSGVDL
LSKMLVYEPSKRISAKKALEHPYFDDLDKSQF
389 CDK type B-1 MDSYEKLEKVGEGTYGKVYKAKDKKTGKLVALKKTRLENDGEGIPPTALREISL 40 948
LQMLSQDMNIVRLLDVEHTENKNGKPLLYLVFEYMDSDLKKYVDGYRRSHTKMP
PKIIKSFMYQLCQGVAYCHSRGVMHRDLKPHNLLVDKQRGVLKIADLGLGRAFT
VPIKKYTHEIVTLWYRAPEVLLGATHYSTPVDIWSVGCIFAEMSRMHALFCGDS
EVQQLMSIFKFLGTPNEGVWPGVTKLKDWHIYPEWRPQDLSRAVPDLEPSGVDL
LTKMLVYEPSKRISAKKALQHPYFDDLDKSQF
390 CDK type B-1 MEKYEKLEKVGEGTYGKVYKGRDKRTGRLVALKKTPFHQEEGIPPTAIREISLL 299 1134
KSLSQCIYIVKLLDVKASFNGKGKHVLFMVFEYADSDLKKHIDAHRQCNTKLSP
RSIQSYMFQLCKGIAYCHSHGVLHRDLKPQNILVDQKIGLLKIADLGLGRACTV
PIKSYTFEVVTLWYRAPEVLLGAKRYSMALDIWSLGCIFAELCNLQALFAGDSQ
IQQLINIFRLLGTPNEQLWPGVTQLSDWHEFPQWRPQDLSKVVFNLDPNGVDLL
SKMLQYDPAKRISAKEALDHPYFDSLDKSQF
391 CDK type C MGCVCGKPSARAADYVESPAEKGASSNSRSSSMASRRLVAPAVMDQGIDAENGH 105 2642
EGDYRTKLRGKQSNGADPVSLLSDDAEKQRHSRHHQHQQHHPIRPHHLRPQGEF
VPNANSNPRFGNPPRHIEGEQVAAGWPAWLTAVAGEAIKGWIPRRADSFEKLDK
IGQGTYSNVYKARDLDTGKIVALKKVRFDNLEPESVRFMAREIQVLRRLDHPNV
VKLEGLVTSRMSCSLYLVFEYMDHDLAGLAACPGIKFTEPQVKCYMQQLLRGLD
HCHSRGVLHRDIKGSNLLIDNGGILKIADFGLATFFHPDQRQPLTSRVVTLWYR
PPELLLGATEYGVAVDLWSTGCILAELLAGKPIMPGRTEVEQLHKIFKLCGSPS
EDYWKKSKLPHATIFKPQQPYKRCVAETFKDFPPSALALMEVLLAIEPADRGTA
TSALKSDFFTTKPLACDPSSLPKYPPSKEFDAKIRDEEARRQRAAGGRGRDAAR
RPSRESRAIPAPEANAELAISIQKRRLSSQGPSKSKSEKFNPQQEDGAVGFPIE
PPRPMHIGIDAGATSRMYSQQFGPSHSGPLSNQISSSIWGKNQKEDEIQMAPGR
PSRSSKATISDFRKPGACAPQPGADLSHLSSLVATARSNAGIDTHKDRSGMWQH
NRIDAIDGVHNNGKHEFLEVPEHPNRQDWTRFQQPESFKGLDNYHLQDLPATHH
RKDERVASKEATMNWQGYGGQGGDKIHYSGPLLPPSGNIDEILKEHERHIQHAV
RRARQDKGRPQRSNLSQNERKAFEHRSFVSGVNGNAGYSDLVNELPISVGSNRL
KVSKTRGTEEIVELRELEREPLSSVMEKYEREHEM
392 CDK type C MGCVCAKQSDILGEPESPKVKGSNLASSRWSVSSETKQLPQHSDSGILHHQHYY 187 2580
HPRDESDEAKLKESNYGGSKRRTRQGRDPADLDMGIFVRTPSSQSEAELVAAGW
PAWMAAFAGEAIHGWIPRRAESFEKLYKIGQGTYSNVYKARDLDNGKIVALKKV
RFDSLDAESVRFMAREILVLRKLDHPNIVKLEGLVTSEVSSSLYLVFEYMEHDL
AGLAACPGIKFTEPQVKCYMQQLLQGLDHCHRHGVLHRDIKGSNLLIDNGGILK
IADFGLATFFYPDQKQLLTSRVVTLWYRPPELLLGATDYGVAVDIWSAGCILAE
LLAGKPILPGRTEVEQLHKTFKLCGSPSEDYWKESKLPHATIFKPQHPYKSCIA
EAFKDFSPSALALLETLLAIEPGHRGEASGALKSEFFTTEPLSCDPSSLPKYPP
SKEFDAKLRAQETRRQRDVGVRGHGSEAARRTSRLSRAGPTPNEGAELTALTQK
QHSTSHATSNIGSEKPSTKKEDYTAGLHIDPPRPVNHSYETTGVSRAYDAIRGV
AYSGPLSQTHVSGSTSGKKPKRDHVKGLSGQSSLQPSKPFIVSDSRSERIYEKS
HVTDLSNHSRLAVGRNRDTTDPHKSLSTLMQQIQDGTLDGIDIGTHEYARAPVS
STKQKSAQLQRPSALKYVDNVQLQNTRVGSRQSDERPANKESDMVSHRQGQRIH
CSGPLLHPSANIEDLLQKHEQQIQQAVRRAHHGKREALSNKSSLPGKKPVDHRA
WVSSGKGNKESPYFKGKGNKELSDLKGGPTAKVTNFRQKVM
393 CDK type C MAVANPGQLNLQEAPSWGSRSVNCFEKLEQIGEGTYGQVYMAKEIETGEIVALK 220 1749
KIRMDNEREGFPITAIREIKLLKKLQHENVIKLKEIVTSPGPEKDEQGKSDGNK
YNGSIYMVFEYMDHDLTGLAERPGMRFSVPQIKCYMKQLLIGLHYCHINQVLHR
DIKGSNLLIDNNGILKLADFGLARSFCSDQNGNLTNRVITLWYRPPELLLGSTK
YGPAVDMWSVGCIFAELLYGKPILPGKNEPEQLTKIFELCGSPDESNWPGVSKL
PWYSNFKPQRQMKRRVRESFKNFDRHALDLVEKMLTLDPSQRISAKDALDAEYF
WTDPVPCAPSSLPRYEPSHDFQTKRKRQQQRQHDEMTKRQKISQHPPQQHVRLP
PIQNAGQGHLPLRPGPNPTMHNPPPQFPVGPSHYTGGPRGAGGQNRHPQNIRPL
HAAQGGGYNANRGYGGPPQQQGGGYPPHGMGNQGPRGGQFGGRGAGYSQGGPYG
GPVGGRGPNVGGGNRGPQFWSEQ
394 CDK type D MQNMEDNVQSSWSLHGNKEICARYEILERVGSGTYSDVYRGRRKADGLIVALKE 438 1748
VHDYQSSWREIEALQRLCGCPNVVRLYEWFWRENEDAVLVLEFLPSDLYSVIKS
GKNKGENGIPEAEVKAWMIQILQGLADCHANWVIHRDLKPSNLLISADGILKLA
DFGQARILEEPEAIYEVEYELPQEDIVADAPGERLMEEDDSVKGVRNEGEEDSS
TAVETNFGDMAETANLDLSWKNEGDMVMQGFTSGVGTRWYRAPELLYGATIYGK
EIDLWSLGCILGELLILEPLFSGTSDIDQLSRLVKVLGTPTEENWPGCSNLPDY
RKLCFPGDGSPVGLKNHVPSCSDSVFSILERLVCYDPAARLNAKEVLENKYFVE
DPYPVLTHELRVPSPLREENNFSEDWAKWKDMEADSDLENIDEFNVVHSSDGFC
IKFS
395 CDK type D MDLNQYPEDLNPELPEGTDNVDNPDNNKGSPVPSPHPPLKPLDPSERYRKGITL 240 1631
GQGTYGIVYKAFDTVTNKTVAVKKIHLGKAKEGVNVTALREIKLLKELSHPNII
QLIDAYPHKQNLHIVFEFMETDLEAVIKDRNLVFSPADIKSYLQMTLKGLAVCH
KKWVLHRDMKPNNLLIAADGQLKLGDFGLARLFGSPDRKFTHQVFAVWYRAPEL
LFGAKQYGPAVDIWATGCIFAELLLRKPFLQGVSDLDQIGKIFAAFGTPRQSQW
PDVASLPDFVEFQFVPAPSLRSLFPMASEDALDLLSKMFTLDPKNRITAQQALE
HRYFSSVPAPTRPDLLPKPSKVDSSRPPKHASPDGPVVLSPSKARRVMLFPNNL
AGILPKQVSQSTTGGTPIEFDMPTQKLREVCPRSRITESGKKHLKRKTMDMSAA
LDECAREQEGQEGKTILDPDHQRSAKKEKHM
396 Cyclin A MAGGQENCVRITRARAACVSKASAPVIQSQVDEKKSRKRAPKRAAVDDLAANAS 252 1604
GSQPKRRAVLGDVTNLHAAATDCLSTAEDQVDAPNPSIKGRARNKKKEARTSTK
VVKDEIHPESNPLADHSSNLSECQKPPAAKLAEQRSLRGVPSKAKQGGSSNSQS
CSKHTDIDKDHTDPQMCTTYVEDIYEYLRNAELKNRPSANFMETAQNDITPNMR
AILVDWLVEVSEEYKLVPDTLYLTVSYIDRYLSANPTSRHKLQLLGVSCMLIAS
KYEEVCPPHVEEFCYITDNTYTRDEMLSMERKILIFLNFEMTKPTTKSFLRRFV
RASQAGNKAPSLHMEFLANYLAELTLMECSFLQYLPSLIAASTVFLSRLTLDFL
TNPWNPTLAHYTGYKASQLKDCVMAIYNVQMNRKGSTLVAIREKYQQHKFKCVA
SLPPPPFIAERFFEDTPN
397 Cyclin A MTGTQASNVRITRARAAKSTLNNALPPLPPAQGKPRGKRAATESNISGFSVAAE 261 1817
PLKRRAVLSDVSNICKEAAAVDCLKKPKAVKVVSQNANAKGRGRGIPRNNKKIT
QEAEIKKETSPAICNVDDASAGNAIGDDKQNNNVNPLKEVQDNPKELNPIAEQI
SVHPHCKQSVEKPNEKEIVVSDNKAAIASLKQQSTLQSLRIPKQPKYSLKQGNP
VPLANLHEDVGRSSCSDFIDIDSEYKDPQMCTAYVTDIYANMRVVELKRRPLPN
FMETTQRDINANMRSVLIDWLVEVSEEYKLVPDTLYLTVSYIDRFLSANVVNRQ
RLQLLGVSCMLVASKYEEICAPPVEEFCYITDNTYKKEEVLEMEISVLNRLQYD
LTTPTTKTFLRRFIRAAQASCKVSSLHLEFMGNYLAELTLVEYDFLKYLPSLIA
AAAVFVARMTLDPMVHPWNSTLQHYTGYKVSDMRDCICAIHDLQLNRKGCTLAA
IREKYNQPKFKCVANLFPPPIISPQFLIDNEV
398 Cyclin B MAAPNQNALLINNNNRRPLVDIGNLVGALNAQCNISKNGARKRAFGDIGNLVED 167 1576
LDAKCTISKYWVRKRPRTNFGVNANKGASSSTQGQGIVVRGEQKAWDRIVWGNK
QSCAIKMNAQHVTATQRGTAISISDIIDSSVQDGGIKAPSQLKARKQTVRTVTA
TLTARSEDSLRDVLEVPPGIDDGDRDNPLAVVEYVEDIYHFYRKIEVRSCVPPD
YMTRQLEIKDSMRGVIIDWLIEVHRTFLLMPETLYLTVNIIDRYLSIQSVTRNE
LQLMGITAMFIASKYEEISPPKINDLVYITKDAYTSKQIVNMEHTILNRLKFKL
TVPTPYVFLVRFLKAAGPDKVMKNLAFFLVDLCLLHYKMIKYSPSMLAAAAVYT
AQCTLKKHPYWNKTLILHIGYSEAHLRECAHLMADLHLKAEGSNLKSVYKKYSY
PIFGSVAFLSPAKIPAGTVAAPAIDKCAHQIYLRNLR
399 Cyclin B MFPNKQTQGLVQNKKMASKAAQPKAMVPPQRVPPAANNRRALGDIGNIVADVGG 183 1598
KCNVTKDGVNGKPLAQVSRPITRSFGAQLLAQAAANKGISAANNQTQVPVVIPK
ADVRGNKQRRTSKSKDIPPTTVVTNESDDCVIIEQAQRIKPTCNHNVGAVGNKE
KPQLLTAKPKSLTASLTSRSAVALRGFRFDDEMTEAEEDPLPNIDVGDRDNQLA
VVEYVEDIYKFYRRTEQMSCVPDYMPRQQEINPKMRAVLINWLIEVHYRFGLMP
ETLYLTTNLIDRYLATQLVSRSNYQLVGATAMLLASKYEEIWAPEMNDFLDILE
NKFERKHVLVMEKAMLNKLKFHLTVPTPYVFLVRFLKAAASDEEMENLVFFLME
LSLMQYVMIKFPPSMLAAAAVYTAQITLKKTTVWNDVLKRHTGYSEIDLKECTR
LMVAFHQSSEESKLNVVFKKYSMPEYDSVALIKPAKLPA
400 Cyclin D MAPSFDCVANAYIESCEDQEKLRQNAQILAQSGENDVDEPVSMLVQRETHYMLP 98 1126
EDYLQRLRNRTLDVNVRREAVGWILKVHSFYNFGAPTAYLAVNYLDRFLSRHRM
PQGVKAWMIQLMAVACLSLAAKMEETQVPLPSDLQREDARFIFDARTIQRMELL
ILSTLQWGMRSITPFSFIDYFAYRAVQGHGHGHDATPKAVMSRAIELILSTTEE
IDFMEYRPSAIAAAALLCAAEEVVPLQAVHYKRALSSSITDVDKDKMFGCYNLI
QETIIEGGCYWTPMSLQSTEKTPVGVLDAAACLSNTPTSSYSVKPYASVTAAKR
RKLNEICSALLVSQAHPC
401 Cyclin D MAANFWTSSHCKELLDAEKVGIVHPLDKDQGLTQEDVKIIKINMSNCIRTLAQY 148 894
VKLRQRVVATAITYCRRVYTRKSFTEYDPQLVAPTCLYLASKAEESTVQAKLVI
FYMKKYSKHRYEIKDMLEMEMKLLEALDYYLVIYHPYRPLIQFLQDAGLNDLKV
TAWALVNDTYRTDLILTYPPYMIALACIYFACIMEEKDAQAWFEELRVDMNEIK
NISMEIVDYYDNYRVIPDEKMNSALNKLPHRF
402 Cyclin D MAPALSSSYECLSHLLCAEDASNVVGCWDEDESKIFCEEEEGFGIQHFPDFPVP 287 1363
DDDEIRVLVRKESQYMPGKSYVQSYQNLGLDFTARQNAIGWILKVHGSYNFGPL
TAYLSINYLDRFLSRNPLPKAKVWMLQLLSVACLSLAAKMEETQVPLLLDLQAE
EPDFLFEPRTIQRMELLVLSTLEWRMLSVTPFSFVDYFLQGGGGRKPPPRAMVA
RANELIFNTHTVLDFLEHRPSAIAAAAVICAAEEVLPLEAAQYKETILSCSLVD
KEWVFGSYNLIQEVLIEKFSTPKKAKSASSSIPQSPVGVLDAFCLSNNSNNTSL
EASLSVNLYASVAAKRRKLNDYCNTWRMFQHSTC
403 Cyclin D MAPNCIDCAPSDLFCAEDAFGVVEWGDAETGSLYGDEDQLHYNLDICDQHDEHL 251 1348
WDDGELVAFAEKETLYVPNPVEKNSAEAKARQDAVDWILKVHAHYGFGPVTAVL
SINYLDRFLSANQLQQDKPWMTQLAAVACLSLAAKMDETEVPLLLDFQVEEAKY
IFESRTIQRMELLVLSTLEWRMSPVTPLSYIDHASRMIGLENHHCWIFTMRCKE
ILLNTLRDAKFLGLLPSVVAAAIMLHVIKETELVNPCEYENRLLSAMKVNKDMC
ERCIGLLIAPESSSLGSFSLGLKRKSSTINIPVPGSPDGVLDATFSCSSSSCGS
GQSTPGSYDSNNSSILCISPAVIKKRKLNYEFCSDLHCLED
404 Cyclin-dependent kinase regulatory subunit MPQIQYSEKYTDDTYEYRHVVLPPETAKLLPKNRLLNENEWRAIGVQQSRGWVH 229 510
YAIHRPEPHIMLFRRPLNYQQNQQQQAGAQSQPMGLKAQ
405 Cyclin-dependent kinase regulatory subunit MDQIEYSEKYYDDTYEYRHVELPPDVARLLPKNRLLTENEWRGIGVQQSRGWVH 92 409
YAIHCSEPHIMLFRRPLNYEQNHQHPEPHIMLFRRPLNCQPNHQPQAHHPT
406 Cyclin-dependent kinase regulatory subunit MDQIEYSEKYYDDTYEYRHVELPPDVARLLPKNRLLTENEWRGIGVQQSRGWVH 64 381
YAIHCSEPHIMLFRRPLNYEQNHQHPEPHIMLFRRPLNCQPNHQPQAHHPT
407 Cyclin-dependent kinase regulatory subunit MPQIQYSEKYYDDTYEYRHVVLPPDVARLLPKNRLLNENEWRGIGVQQSRGWVH 68 349
YAIHRPEPHIMLFRRHLNYQQNQQQQAQQQPAQAMGLQA
408 Histone MALVETEPVTLIHPEEPKKFKKKPTPGRGGVISHGLTEEEARVKAIAEIVGAMV 125 1849
acetyltransferase EGCRKGEDVDLNALKAAACRRYGLSRAPKLVEMIAALPDGERAAVLPKLKAKPV
RTASGIAVVAVMSKPHRCPHIATTGNICVYCPGGPDSDFEYSTQSYTGYEPTSM
RAIRARYNPYVQTRSRIDQLKRLGHTVDKVEFILMGGTFMSLPADYRDYFIRNL
HDALSGHTSSNVEEAVCYSEHSATKCTGLTIETRPDYCLGPHLRQMLSYGCTRL
EIGVQSTYEDVARDTNRGHTVAAVADCFCLAKDAGFKVVAHMMPDLPNVGVERD
MESFREFFENPAFRADGLKIYPTLVIRGTGLYELWKTGRYRNYPPEQLVDIIAR
VLALVPPWTRVYRVQRDIPMPLVTSGVEKGNLRELALARMDDLGLKCRDVRTRE
AGIQDIHHKIRPEVVELVRRDYCANEGWETFLSYEDTRQDILVGLLRLRKCGHN
TTCPELKGRCSIVRELHVYGTAVPVHGRDADKLQHQGYGTLLMEQAERIAWKEH
RSIKIAVISGVGTRHYYRKLGYELEGPYMMKYLN
409 Histone MLGFRDLYTSICEHLQRASGRLPIIAAATSLISTPEIAAVEKENKAPNSVDKMG 70 1602
acetyltransferase MGSADESGRFSTSNGQFMNMNNGVVKEEWKGGVPVVPSAPTTVPVITNVKLETP
SSPDHDMARKRKLGFLPLEVGTRVLCKWRDGKFHPVKIIERRKLPNGATNDYEY
YVHYTEFNRRLDEWVKLEQLELDSVETDADEKVDDKAGSLKMTRHQKRKIDETH
VEGNEELDAASLREHEEFTKVKNITKIELGRYEIETWYFSPFPSEYNNCEKLYF
CEFCLNFMKRKEQLQRHMRKCDLKHPPGDEIYRSGTLSMFEVDGKKNKVYAQNL
CYLAKLFLDHKTLYYDVDLFLFYILCECDERGCHMVGYFSKEKHSEESYNLACI
LTLPPYQRKGYGKFLISFSYELSKKEGKVGTPERPLSDLGLLSYRGYWTRVLLD
ILKKHKSNISIKELSDMTAIKADDVLSTLQGLDLIQYRKGQHAICADPKVLDRH
LKAVGRGGLEVDVCKLIWTPYKEQ
410 Histone MGSLDESTCSEEIRDEGKDSIRTKFKVESTVNNAQNGGNDNSKKKRAAGLPLEV 140 1465
acetyltransferase GIRLLCKWRDSKLHPVKIIERRKLPNGFPQDYEYYVHYTEFNRRLDEWVKLEQF
ELDSVETDADEKIEDKGGSLKMTRHQKRKIDEIHVEEGQGHEDFDPASLREHEE
FTKVKNIAKVELGRYEIETWYFSPFPPEYSHCEKLFFCEFCLNFMKRKEQLQRH
MRKCDLKHPPGDEIYRNGTLSMFEVDGKKNKIYGQNLCYLAKLFLDHKTLYYDV
DLFLFYVLCECDDRGCHVVGYFSKEKHSDEAYNLACILTLPPYQRKGYGKFLIA
FSYELSKKEGKVGTPERPLSDLGLLSYRGYWTRILLDILKKQRGNISIKELSDM
TAIKVEDVISTLQVLDLIQYRKGQHVICADPKVLDRHLKAAGIAGLEVDVSKLI
WTPYKEQCG
411 Histone MASAPMVGCDDSRDKHRWVESKVYMRKGHGKGSKGNAGFNAQNSTAQVRRENDN 628 2565
acetyltransferase MGNSIADNGKSEAASEGLSSLSRKQITVNQDHPPNETSSMPAVGGLQNIDTHVT
FKLEGCSKQEIWELRKKLTNELEQVRGTFKKLEARELQLRGYSVSAGVNTSYSA
SQFSGNDMRNNGGKEVTSEVASGGAITPKQAQRESNPPRQLSISLMENNQAASD
MGEKGKRTPKANQYYRNSEFVLGKDKFPPAESKKSKSTGNKKISQSKVFSKETM
QVGKEFMPQKSVNEVFKQCSLLLTKLMKHKYGWVFNLPVDAQALGLHDYHTIIK
RPMDLGTVKSKLEKNLYNSPASFAEDVKLTFSNAMTYNPKGHEVHTMAEQLLQL
FEERWKTIYEEHLDGKMRFGSGQGLGASSSTKKLPFQDSKKNIKKSEPAGGPSP
PKPKSTNHHASRTPSAKKPKAKDPHKRDMTYEEKQKLSTNLQNLPQERLELIVQ
IIKKRNPSLCQHDEEIEVDIDSFDTETLWELDRFVTNYKKSLSKNKKKALLADQ
AKRASEHGSARNKHPMIGRELPMNNKKGEQGEKVVEIDHMPPVNPPVVEVEKDG
VYAKRSSSSSSSSSDSGSSSSDSDSGSSSGSESDAYAATSPPAGSNTSARG
412 Histone MEGHSGALGFGQGFSRSSQSPNLSPSPSHSASASVTSSGQKRKRNEVEHAGVAS 55 1818
acetyltransferase NSTGMFAVPPSHIYSHLHPMSMSMPMPMHNSHPSSLSESRDGALTSNDDDDNLT
GGNQSQLDSMSAGNTDGREDFDDEDDDDDDEEDDDEVEGDEEDQDHDPDADDDS
DDGHDSMRTFTAARLDNGAPNSRNLKPKADAAGVAIAPTVKTEPILDTVKEEKV
SGNNNNNSVSANNAQVAPSGSAVLLSAVKEEANKPTSTDHIQTSGAYCAREESL
KREEDADRLKFVCFGNDGIDQHMIWLIGLKNIFARQLPNMPKEYIVRLVMDRSH
KSVMIIKQNQVVGGITYRPYLSQKFGEIAFCAITADEQVKGYGTRLMNHLKQHA
RDVDGLTHFLTYADNNAVGYFIKQDFTKEIKLEKERWHGYIKDYDGGILMECKI
DPKLPYTDLPAMIRWQRQTIDEKIRELSNCHIVYSGIDIQKKEAGIPRKPIKVE
DIPGLKEAGWTTDQWGHSRFRLLNSPSEGLPNRQVLHAFMRSLHKAMVEHADAW
PFKEPVDPRDVPDYYDIIKDPMDVKRMFTNARTYNTHETIYYKCANR
413 Histone MEESGNSLTSGPDGSKRRVSYFYDSDIGNYYYSQGHPMKPHRIRMAHSLIVHYA 259 1710
deacetylase LDEKMEVCRPNLLQSRELRVFHADDYISFLQSVTPETQHEQLRQLKRFNVGEDC
PVFDGLYNFCQTYAGGSVGAAIKLNNKEADIAINWSGGLHHAKKCEASGFCYVN
DIVLAILELLKVHQRVLYIDIDIHHGDGVEEAFYSTDRVMSVSFHKFGDYFPGT
GHLKDVGYGKGKYYSLNVPLNDGIDDESYKNLFRPIIQKVMEIYQPEAVVLQCG
ADSLSGDRLGCFNLSVKGHADCVRFLRSFNVPLVLVGGGGYTIRNVARCWCYET
AVAVGVEPQDKLPYNEYYEYFGPDYTLHVAPSNMENQNSAKELAKIRNTLLEQL
KRIQHVPSVPFQERPPDTKFPEEDEEDYEKRPKGHKWGGEYFGSESDEEQKPQN
RDIDISDKPGIRRQSPPNVEAAKKIKVEEEDGDIGIVNENDGAKWPLGEAG
414 Histone MEESGNSLTSGPDGSKRRVSYFYDSDIGNYYYSQGHPMKPHRIRMAHSLIVHYA 356 1807
deacetylase LDEKMEVCRPNLLQSRELRVFHADDYISFLQSVTPETQHEQLRQLKRFNVGEDC
PVFDGLYNFCQTYAGGSVGAAIKLNNKEADIAINWSGGLHHAKKCEASGFCYVN
DIVLAILELLKVHQRVLYIDIDIHHGDGVEEAFYSTDRVMSVSFHKFGDYFPGT
GHLKDVGYGKGKYYSLNVPLNDGIDDESYKNLFRPIIQKVMEIYQPEAVVLQCG
ADSLSGDRLGCFNLSVKGHADCVRFLRSFNVPLVLVGGGGYTIRNVARCWCYET
AVAVGVEPQDKLPYNEYYEYFGPDYTLHVAPSNMENQNSAKELAKIRNTLLEQL
KRIQHVPSVPFQERPPDTKFPEEDEEDYEKRPKGHKWGGEYFGSESDEEQKPQN
RDIDISDKPGIRRQSPPNVEAAKKIKVEEEDGDIGIVNENDGAKWPLGEAG
415 Histone MEFWGVEVKPGEALTCDPGDERYLHMSQAAIGDKEGAKENERVSLYVHVDGKKF 261 1298
deacetylase VLGTLSRGKCDQIGLDLVFEKEFKLSHTSQTGSVFVSGYTTVDHEALDGFPDDE
DLESSEDEEEELAQITTLTAKENGGKTGAKPVKPESKSSVTDKAAAKGKPEVKP
PVKKQEDDSDSDEDEDEDEDEDEDDDDEDDEDMKDASASDDGDEEDDSDEESDD
DEEEDEETPKPAAGKKRPMPASDNKSPATDKKAKITTPAGGQKPGADKGKKTEH
IATPYPKHGAKGPASGVKGKETPLGSKQTPGSKVKNSSTPESGKKSGQFKCQSC
SRDFATEGALSSHNAAKHGGK
416 Histone MMETGGNSLPSGPDGVKRKVAYFYDPEVGNYYYGQGHPMKPHRIRMTHALLVQY 365 2251
deacetylase GLHKEMQILKPYPARDRDLCRFHADDYVAFLRGITPETIQDQVKALKRFNVGDD
CPVFDGLYQYCQTYAGGSVGGAVKLNHKLCDIAINWAGGLHHAKKCEASGFCYV
NDIVLAILELLKYHKRVLYVDIDIHHGDGVEEAFYTTDRVMTVSFHKFGDYFPG
TGDIRDIGCGKGKYYAVNVPLDDGIDDESFQSLFKPIIQQVMLVYNPEAIVLQC
GADSLSGDRLGCFNLSVKGHAECVRYMRSFNVPLLMVGGGGYTVRNVARCWCYE
TGVAVGVEIDDKMPQHEYYEYFGPDYTVHVAPSNMENKNTKQYLDKIRSKILEN
INSLPCAPSAQFQVQPPDTDFPELEEEDYDERTRSHKWDGASCDSDSENGDLKH
RNHDVEESAFPRHNLANISYNTKIKLEGVGTGGLDMAAGTDTKKNDESFEAMDY
ESGEELRQDHFASTINASQPCDPALLTGVQNQLQSTDTVKPIEQSGNAPGIPPP
SVATVSTGTRPSSISRTSSLNSMSSVKQGSILGPNPPQGLNASGLQFPVPTSNS
PIRQGGSYSITVQAPDKQGLQNHMKGPQNMPGNS
417 Histone MPPKDRVAYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVLSYELHKKMEIYRPHK 156 1454
deacetylase AYPVELAQFHSADYVEFLHRITPDTQHLFTKELVKYNMGEDCPVFENLFEFCQI
YAGGTIDAAHRLNNQICDIAINWSGGLHHAKKCEASGFCYINDLVLGILELLKH
HARVLYVDIDVHHGDGVEEAFYFTDRVMTVSFHKYGDMFFPGTGDVKEVGEREG
KYYAINVPLKDGIDDASFTRLFKTIITKVVDIYQPGAIVLQCGADSLAGDRLGC
FNLSIDGHAQCVRIVKKFNLPLLVTGGGGYTKENVARCWSVETGVLLDTELPNE
IPDNDYIKYFAPDYSLKINTAGNMENLNSKTYLSAIKVQVMENLRAIQHAPSVQ
MHEVPPDFYIPDIDEDELNPDERMDQHTQDRQIQRDDEYYDGDNDIDHDMEEAS
418 Histone MDSSKSEEANILHVFWHEGMLNHDLGTGVFDTLEDPGFLEVLEKHPENADRVRN 203 1348
deacetylase MLSILRKGPIAPYTEWHTGRAAYLSELYSFHRPDYVDMLAKTSTAGGKTLCHGT
RLNPGSWEAALLAAGTTLEAMRYILDGHGKLSYALVRPPGHHAQPTQADGYCFL
NNAGLAVELAVASGCKRVAVVDIDVHYGNGTAEGFYERDDVLTISLHMNHGSWG
PSHPQTGFHDEVGRGKGLGFNLNVPLPNGTGDKGYEHAMHELVVPAISKFMPEM
IVLVIGQDSSAFDPNGRECLTMEGYRKIGQIMRQQADQFSGGRLVVVQEGGYHI
TYAAYCLHATLEGVLCLPHPLLSDPIAYYPEHDIYSERVTFIKNYWQGIISTTD
KRN
419 Histone MEESGNALVSGPDGSKRRVTYFYDADIGNYYYGQGHPMKPHRMRMAHNLIVHYG 229 1644
deacetylase LHQRMEVCRPHLAQSKDIRAFHTDDYIHFLSSVAPDTQQEQLRQLKRFNVGEDC
PVFDGLFNFCQSSAGGSIGAALKLNRKDADIAINWAGGLHHAKKCEASGFCYVN
DIVLGILELLKVHQRVLYIDIDIHHGDGVEEAFYTTDRVMTVSFHKFGDYFPGT
GHIKDVGYGKGKYYALNVPLNDGIDDESYKHLFRPIIQKVMEVYQPEAVVLQCG
ADSLSGDRLGCFNLSVKGHADCVRFVRSFNIPLMLVGGGGYTIRNVARCWCYET
AVAVGVEPQDKLPYNEYYEYFGPDYTLYVAPSNMENLNTEKDLEKMRNVLLEQL
SKIQHTPSVPFQERPPDTEFNDEEEEDMEKRSKCRIWDGEYVGSEPEEDGKLPR
FDADTYERSVLKHENKRLVPVSNVEPLKRIKQEEDGAAV
420 Histone MPPKDRVAYFYDGDVGSVYFGPNHPMKPHRLCMTHHLVLSYELHKKMEIYRPHK 156 1454
deacetylase AYPVELAQFHSADYVEFLHRITPDTQHLFTKELVKYNMGEDCPVFENLFEFCQI
YAGGTIDAAHRLNNQICDIAINWSGGLHHAKKCEASGFCYINDLVLGILELLKH
HARVLYVDIDVHHGDGVEEAFYFTDRVMTVSFHKYGDMFFPGTGDVKEVGEREG
KYYAINVPLKDGIDDASFTRLFKTIITKVVDIYQPGAIVLQCGADSLAGDRLGC
FNLSIDGHAQCVRIVKKFNLPLLVTGGGGYTKENVARCWSVETGVLLDTELPNE
IPDNDYIKYFAPDYSLKINTAGNMENLNSKTYLSAIKVQVMENLRAIQHAPSVQ
MHEVPPDFYIPDIDEDELNPDERMDQHTQDRQIQRDDEYYDGDNDIDHDMEEAS
421 Histone MDLNLVSHGEEEEGVRRRKVGIVYDERMCKHATPEDQPHPEQPDRIRVIWDKLN 27 2222
deacetylase SAGVLHKCVMVEAKEASEEQLAGVHSRKHIEVMKSIGTARYNKKKRDKLAASYS
SIYFSQGSSEAALLAAGSVVEISEKVASGELDAGVAIVRPPGHHAEADKAMGFC
LFNNIAIAAKHLVHERPELGVQEVLIVDWDVHHGNGTQHMFWTDPHVLYFSVHR
FDAGTFYPGGDDGFYDKIGEGKGAGYNINVPWEQGKCGDADYLAVWDHVLVPVA
KSYDPDMVLISGGFDAALGDPLGGCRLTPYGYSLMTKKLMEFAGGKIVLALEGG
YNLKSLADSFLACVEALLKDGPSRSSVLTHPFGSTWRVIQAVRKELSSFWPALN
EELQLPRLLKDASESFDKLSSSSSDESSASEDEKKIAEVTSIMEVSPDPSSILA
LTAEDIAQPLAGLKIEEAGTDSQRSSDHTLLDLTNDDTQKLKQFEGEIFVMIGD
EESVPSASSSKDQNESTVVLSKSNIKAHSWRLTFSSIYVWYASYGSNMWNPRFL
CYIEGGQVEGMAKRCCGSEDKTPPQRIQWKVVPHRMFFGRSYTNTWGSGGVSFL
DPNCSDTSEAHVCLYKITLAQFNDLLLQENNLNCGTEHPLVDLSSIDAIRNGNS
ILELIKDSWYGTLIYLGMEGGLPIVTFTCSVCDVEKFKHGQLPLCPPSSRYENI
LIRGLVQGKKLSEDDATAYIRAASTSPLL
422 Peptidylprolyl MADEDLDLSDVGEVEDEPGEEIESTPPLAVGQEKEINSLALKKKLLKVGTRWET 71 1759
isomerase PENGDEVTVHYTGTLPDGTKFDSSRDRGEPFTFKLGQGQVIKGWDQGIVTMKKG
ERALFTIPPELAYGSSGVRPTIPPNATLQFDVELLSWTNIVDVCNDGGILKRII
SEGEKYERPKDPDEVTVKYEAKLEDGTLVAKSPEEGVEFYVNDGHFCPAIAKAV
KTMKRGESVILTIKPTYAFGERGKDAEEGFAAIPPNATLTTSLELVSFKAVIAV
TEDKKVIKKILKEADGYDKPSDGTVVQIRYTAKLQDGTIFEKKGYEGEEPFQFV
VDEEQVIAGLDKAVETMKTGEIALITIGAEYGFGNFETQRDLAVIPPNSTLIYE
VEMISFTKEKESWDMDTTEKIEASKQKKEQGNSLFKVGKYQRAAKKYEKAAKYI
EHDSSFSAEEKKQSKVLKVSCNLNHAACRLKLKDFKEAVKLCSKVLELESQNVK
ALYRRAQAYIETADLDLAEFDIKKALEIEPQNREVQLEYKILKQKQIEYNKKDA
KLYGNMFAKLNKLEAFEGKVLS
423 Peptidylprolyl MADEGLELSDVAEVEDEPGEEFESAPPLVVGQEKELNSSGLKKKLLKAGTRCET 358 2040
isomerase PENGDEVTVHYTGTLLDGTKFDSSRDRGEPFTFNIGQGQVIKGWDQGIVTMKKR
EHALFTIPPELAYGASGMPPTIPPNATLQFDVELLSWTNIVDVCKDGGILKRII
SDGEKYERPKDPDEVTVKYEAKLEDGMLVAKSPEEGVEFYVNDGNFCPAIVKAV
KTMKKGENVTLTIKPAYAFGEQGKDAEEGFAAIPPNATITINLQLVSFKAVKEV
TEDKKVIKKILKEADGYDKPSDGTVVQIRYTAKLQDGTIFEKKGYAGEEPFQFV
VDEEQVIAGLDKAVETMKTGEVALITIGPEYGFGNIETQRDLAVIPPYSTLIYE
VEMVSFTKEKESWDMNTTENIEASKQKKEQGNSLFKVGKYLRAAKKYDKAAKYI
EHDNSFSAEEKKQSKVLKVSCNLNHAACCLKLKDFKKAVKLCSKVLELESQNVK
ALYRRAQAYIETADLDLAEFDIKKALEIEPQNREVRLEYLILKQKQIEYNKKDA
KLYGNMFARQNKLEAIEGKD
424 Peptidylprolyl MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGTGRSGKPLH 238 756
isomerase FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA
NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP
VVIADSGQLA
425 Peptidylprolyl MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGNGRSGKPLH 238 756
isomerase FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA
NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP
VVIADSGQLA
426 Peptidylprolyl MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGTGRSGKPLH 238 756
isomerase FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA
NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP
VVIADSGQLA
427 Peptidylprolyl MPNPKVFFDMQVGGAPAGRIVMELYADVVPKTAENFRALCTGEKGTGRSGKPLH 238 756
isomerase FKGSSFHRVIPGFMCQGGDFTRGNGTGGESIYGEKFADENFVKKHTGPGILSMA
NAGPNTNGSQFFICTAQTSWLDGKHVVFGQVVEGLEVVRDIEKVGSGSGRTSKP
VVIADSGQLA
428 Peptidylprolyl MADDFELPESAGMMENEDFGDTVFKVGEEKEIGKQGLKKLLVKEGGSWETPETG 176 1912
isomerase DEVEVHYTGTLLDGTKFDSSRDRGTPFKFKLGQGQVIKGWDQGIATMKKGENAV
FTIPPDLAYGESGSQPTIPPNATLKFDVELLSWASVKDICKDGGIFKKIIKEGE
KWEHPKEADEVLVKYEARLEDGTVVSKSEEGVEFYVKDGYFCPAFAIAVKTMKK
GEKVLLTVKPQYGFGHQGREAIGNDVARSTNATLLVDLELVSWKVVDEVTDDKK
VLKKILKQGEGYERPNDGAVVKVKYTGKLEDGTIFEEKGSDEEPFEFMAGEEQV
VDGLDRAVMTMKKGEVALVSVAAEYGYQTEIKTDLAVVPPKSTLIYEVELVSFV
KEKESWDMNTAEKIEAAGKKKEEGNALFKVGKYFRASKKYEKATKYIEYDTSFS
EEEKKQSKPLKVTCNLNNAACKLKLKDYTQAEKLCTKVLEVESQNVKALYRRAQ
AYIQTADLELAELDIKKALEIDPNNRDVKLEYRALKEKQKEYNKKEAKFYGNMF
ARMSKLEELESRKSGSQKVETANKEEGSDAMAVDGESA
429 Peptidylprolyl MAASLTPLGAGLAYATIYDQAKVRKLEPTKRSLIALCQHSDSQHRRFITRKYHV 64 765
isomerase NVQILNRRDAIRLIGLAAGLCIDLSLMYDARGAGLPPQENAKLCDTTCEKELEN
APMITTESGLQYKDIKIGNGPSPPIGFQVAANYVAMVPSGQVFDSSLDKGQPYI
FRVGSGQVIKGLDEGLLSMKVGGKRRLYIPGPLAFPKGLNSAPGRPRVAPSSPV
IFDVSLEFIPGLESEEE
430 Peptidylprolyl MSAASLSADMAIRGTILGKTALHVLGPQVVSQCRQPVMFKCPPHTLRKMRFSAQ 93 881
isomerase DLQSKNFYSGFTPFKSVFISTSKRSWQAGSARAMSQDAAFQSKVTTKCFLDIEI
GGDPAGRIVLGLFGEDVPKTAENFRALCTGEKGFGYKGSSFHRIIKDFMLQGGD
FDRGDGTGGKSIYGRTFEDENFKLAHVGPGVLSMANAGPNTNGSQFFICTVKTP
WLDKRHVVFGQVIEGMEIVKKLESEETNRTDRPKRPCRIVDCGELP
431 Peptidylprolyl MGRIKPQTLLQQSKKKKVPGRISVSTIIVCNLIIIFLMFSLVGIYRQRAKRNRA 372 1070
isomerase TSRSDGDEEMENFGRSKINSVPHQAIVNTTKGLITLELFGKSSAHTVEKFVEWS
ERGYFNGLPFYRVIKHFVIQVGDPKFAGNREDWTVGGQLNVQLEFSPKHEAFML
GTSKLEDQGDGFELFITTAPIPDLNDKLNVFGRVIKGQDVVQEIEEVDTDEHFQ
PKSPIIINDVRLKDEL
432 Peptidylprolyl MARQSTLLLFWSLVFLGAIVFTQAKHEELEEVTHKVYFDVDIAGKPAGRVVIGL 28 594
isomerase FGKAVPKTVENFRALCTGEKGVGKSGKPLHYKGSFFHRIIPSFMIQGGDFTLGD
GRGGESIYGTKFADENFKLKHTGPVFITTVTTDWLDGRHVVFGKIISGMDVVYK
VEAEGRQSGQPKRKVKIADSGELSMD
433 Peptidylprolyl MARQSTLLLFWSLVFLGAIVFTQAKHEELEEVTHKVYFDVDIAGKPAGRVVIGL 34 648
isomerase FGKAVPKTVENFRALCTGEKGVGKSGKPLHYKGSFFHRIIPSFMIQGGDFTLGD
GRGGESIYGTKFADENFKLKHTGPGFLSMANAGPDTNGSQFFITTVTTDWLDGR
HVVFGKIISGMDVVYKVEAEGRQSGQPKRKVKIADSGELSMD
434 Peptidylprolyl MEMDEIQEQSQPQSSEKQDISQESDTGNDKTINAEKITSENAEVEEDDMLPPKV 481 1611
isomerase NTEVEVLHDKVTKQIIKEGSGNKPSRNSTCFLHYRAWAESTMHKFQDTWQEQQP
LELVLGREKKELSGFAIGVAGMKAGERALLHVDWQLGYGEEGNFSFPNVPPRAN
LIYEAELIGFEEAKEGKARSDMTVEERIEAADRRRQQGNELFKEDKLAEAMQQY
EMALAYMGDDFMFQLFGKYKDMANAVKNPCHLNMAQCLLKLNRYEEAIGQCNMV
LAEDEKNIKALFRRGKARATLGQTDDAREDFQKVRKFSPEDKAVIRELRLLAEH
DKQVYQKQKEMFKGLFGQKPEQKPKKLHWFVVFWQWLLSMIRTIFRMRSKTD
435 Peptidylprolyl MAGAGEGTPEVTLETSMGPITVELYHKHAPKTCRNFLELSRRGYYNNVKFHRVI 93 584
isomerase KDFMVQGGDPTGTGRGGESIYGPRFEDEITRDLKHTGAGILSMANAGPNTNGSQ
FFISLAPTPWLDEKHTIFGRVCKGMDVVKRLGNVQTDKNDRPIHDVKILRTTVKD
436 Peptidylprolyl MMDPELMRLAQEQMSKISPDELMKMQRQIMANPDLMRMASENMKNLKPEDIRFA 250 1869
isomerase AEQMKNVRKEEMAEISERISRASPEEIEAMKARANLQSAYQLQVAQNLKDQGNQ
LHARMKYSEAAEKYLQARNNLTGIPFSEAKSLLLASSSNLMSCYLKTGQYEECV
QTGSEVLAYDAMNVKALYRRGQAYKQIGKLELAVADLRKAVEVSPEDETIAQAL
REASTELMEKGGTQDQNGPRIEEIIEEEAVQPTAEKYPQSAPMVTSVTEDVSDD
EQGSEDQNGFSRDSFQATNAPDGQMYAESLRNLTENPDMLRTMQSLMKNVDPDS
LVALSGGKLSPDMVKTVSGMFGRMSPEEIQNMMKMSSTLSRQNPSTSSRFDDIT
RGHSNMDSSPQSVSVDNDLFEENQNRVGESSTNLSSSAAFSGMPNFSAEMQEQV
RNQMNDPATRQMFTSMIQNMSPEMMASMSEQFGVKLSPEDAVKAQNAMASLSPN
DLDRLMNWATRLQTAIDYARKIKNWILGRPGLIFAISMLLLAIILHRFGYIGD
437 Peptidylprolyl MGVEKEILRPGNGPKPRPGQSVTVHCTGYGKNEDLSQKFWSTKDPGQKPFTFTI 84 422
isomerase GQGRVIKGWDEGVLDMQLGEIFKLRCSPDYGYGSNGFPAWGIRPNSVLVFEIEV
LSVN
438 Peptidylprolyl MPNPRCYLDITIGEELEGRILVELYSDVVPKTAENFRALCTGEKGIGPHTGVPL 128 1213
isomerase HYKGLPFHRVIKGFMIQGGDISAQNGTGGESIYGLKFDDENFQLKHERRGMLSM
ANSGPNTNGSQFFITTTRTSHLDGKHVVFGKVIKGMGVVRGIEHTPTESNDRPS
LDVVISDCGEIPEGSDDGIANFFKDGDLYPDWPADLDEKSAEISWWMNAVDSAK
CFGNENYKKGDYKMALRKYRKALRYLDICWEKEEIDEEKSNHLRKTKSQIFTNS
SACKLKLGDLKGALLDTEFAMRDGEDNVKALFRQGQAYMALKDVDSAVASFKKA
LQLEPNDAGIRKELAVATKMINDRRDQERRAYARMFQ
439 Peptidylprolyl MGDVIDLNGDGGVLKTIIRSAKPGAMQPTEDLPNVDVHYEGTLADTGEVFDTTR 265 837
isomerase EDNTLFSFELGKGTVIKAWDIAVKTMKVGEVARITCKPEYAYGSAGSPPDIPEN
ATLIFEVELVACKPRKGSTFGSVSDEKARLEELKKQREIAAASKEEEKKRREEA
KATAAARVQAKLEAKKGQGRGKGKSKGK
440 Peptidylprolyl MGLGLKIASASFLPIFNIMATRSLCILLVCFIPVLAHVLSLQDPELGTVRVYFQ 38 781
isomerase TTYGDIEFGFFPHVAPKTVEHIYKLVRLGCYNSNHFFRVDKGFVAQVADVVGGR
EVPLNSEQRKEGEKTIVGEFSEVKHVRGILSMGRYSDPDSASSSFSILLGNAPH
LDGQYAVFGKVTKGDDTLKRLEEVPTRQEGIFVMPLERIRILSTYYYDTNERES
NLTCDHEVSILKRRLVESAYEIEYQRRKCLP
441 Peptidylprolyl MASKRSLRTMNVWPTLPPLVLLLLLCFSSMSSSVVAKKSDVSELQIGVKHKPKS 38 526
isomerase CDIQAHKGDRIKVHYRGSLTDGTVFDSSFERGDPIEFELGSGQVIKGWDQGLLG
MCVGEKRKLRIPSKLGYGAQGSPPKIPGGATLIFDTELVAVNGKGISNDGDSDL
442 Peptidylprolyl MSGAPAERPISYFDITIGGKPIGRIVFSLYADLVPKTAENFRALCTGEKGIGKS 37 1158
isomerase GKPLCYAGSGFHRVIKGFMCQGGDFTAGNGTGGESIYGEKFEDEAFPVKHTKPF
LLSMANAGKDTNGSQFFITVSQTPHLDDKHVVFGEVIKGKSIVRAIENYPTASG
DVPTSPIIISACGVLSPDDPSLAASEETIGDSYEDYPEDDDSDVQNPEVALDIA
RKIRELGNKLFKEGQIELALKKYLKSIRYLDVHPVLPDDSPPELKDSYDALLAP
LLLNSALAALRTQPADAQTAVKNATRALERLELSDADKAKALYREASAHVILKQ
EDEAEEDLVAASQLSPEDMAISSKLKEVKDEKKKKREKEKKAFKKMFSS
443 Peptidylprolyl MASSLRSSLFSSWALDSKSVCSLFNLNPGKMGLPSISTPLNWRTCCCSHSSELL 61 768
isomerase ELNEGLQSSRRKTVMGLSTVIALSLVYCDEVGAVSTSKRALRSQKVPEDEYTTL
PNGLKYYDLKVGSGTEAVKGSRVAVHYVAKWKGITFMTSRQGMGITGGTPYGFD
VGASERGAVLKGLDLGVQGMRVGGQRILIVPPELAYGNTGIQEIPPNATLEFDV
ELISIKQSPEGSSVKIVEG
444 WD40 repeat MGAIEDEEPPLKRLKVSSPGLRRGLEEEAPSLSVGSVSILMAKSLSLEEGETVG 421 2172
protein SKGLIRRVEFVRIITQALYSLGYQKAGALLEEESGILLQSSNVALFRKQILDGK
WDESVVTLRGIDQVEVEGNTLKAASFLILQQKFFELLDKGNIPEAMKTLRLEIS
PMQLNTKRVHELASCIVFPSRCEELGYSKQGNPKSSQRMKVLQEIQQLLPPSIM
IPEKRLERLVEQALNVQREACIFHNSLDPALSLYTDHQCGRDQIPTTTLQVLES
HKNEVWFLQFSNNGKYLASASKDCSAIIWEITEGDSFSMKHRLSAHQKPVSFVA
WSPDDKLLLTCGIEEVVKLWNVETGECKLTYDKANSGFTSCGWFPDGERFISGG
VDKCIYIWDLEGKELDSWKGQGMPKISDLAVTSDGKEIISICGDNAIVMYNLDT
KTERLIEEESGITSLCVSKDSRFLLLNLANQEIHLWDIGARSKLLLKYKGHRQG
RYVIRSCFGGSDLAFVVSGSEDSQVYIWHRGNGELLAVLPGHSGTVNCVSWNPV
NPHVFASASDDYTIRIWGVNRNTFRSKNASSSNGVVHLANGGP
445 WD40 repeat MPGTTAGAGIEPIEPQSLKKLSLKSLKRSFDLFASLHGEPQPPDQRSQRIRIAC 163 1647
protein KVRAEYEVVKNLPTLPQREVGSSVSNSNVGETHSSLTTNQAQGFPTDTSGDLSK
DEGKEITSIAVHLQPQTGLIDGKAGAIAGTSTAISSVGSSDRYQPSAAIMKRLP
SKWPRPIWHPPWKNYRVISGHLGWVRSVAFDPGNEWFCTGSADRTIKIWEVATG
KLKLTLTGHIEQIRGLAVSSRHPYLFSAGDDKQVKCWDLEYNKAIRSYHGHLSG
VYCLALHPTLDILCTGGRDSVCRVWDIRTKAQIFALSGHENTVCSVFTQAIDPQ
VVTGSHDTTIKLWDLAAGKTMSTLTYHKKSVRAIAKHPFEHTFASASADNIKKF
KLPKGEFLHNMLSQQKTIVNAMAINEDNVLVSAGDNGSLWFWDWKSGHNFQQAQ
TIVQPGSLDSEAGIYALQYDITGSRLVSCEADKTIKMWKEDETATPESHPINFK
APKDIRRF
446 WD40 repeat MRPILMKGHERPLTFLKYNRDGDLLFSCAKDHTPTVWYGHNGERLGTYRGHNGA 192 1172
protein VWCCDVSRDSTRLITSSADQTAKLWNVETGAQLFSFNFESPARAVDLAIGDKLV
VITTDPFMELPSAIHIKRIEKDLSKQTADSVLTITGIKGRINRAVWGPLNSTII
SGGEDSVVRIWDSETGKLLRESDKETGHQKPITSLCKSADGSHFLTGSLDKSAR
LWDIRTLTLIKTYVTERPVNAVAISPLLDHVVIGGGQEASHVTTTDRRAGKFEA
KFFHKILEEEIGGVKGHFGPINSLAFNPDGRSFASGGEDGYVRLHHFDPDYFHI
KM
447 WD40 repeat MRPILMKGHERPLTFLKYNRDGDLLFSCAKDHTPTVWYGHNGERLGTYRGHNGA 131 1111
protein VWCCDVSRDSTRLITSSADQTAKLWNVETGNQLFSFNFESPARAVDLAIGDKLV
VITTDPFMELPSAIHIKRIEKDLSKQTADSVLTITGIKGRINRAVWGPLNSTII
SGGEDSVVRIWDSETGKLLRESDKETGHQKAITSLCKSADGSHFLTGSLDKSAR
LWDIRTLTLIKTYVTERPVNAVAISPLLDHVVIGGGQEASHVTTTDRRAGKFEA
KFFHKILEEEIGGVKGHFGPINSLAFNPDGRSFASGGEDGYVRLHHFDPDYFHI
KM
448 WD40 repeat MAENNVGDFIPLDRQEYPSKPAPGAVDSSFWKSFKKKEVSRQIAGVTCINFCPE 149 1726
protein PPHDFAVTSSTRVHIYDGKSCELKKTITKFKDVAYSGVFRSDSQIIAAGGETGV
IQVFNAKSQMVLRQLKGHGRPVRVVRYSPQDKLHLLSGGDDSMVKWWDITTQEE
LLNLEGHKDYVRCGAASPSSVNLWATGSYDHTVRLWDLRNSKTVLQLKHGKPLE
DVLFFPSGGLLATAGGNVVKVWDILGGGRPIHTMETHQKTVMAMCISKVPRSGQ
ALGDAPSRLVTASLDGYMKVFDLDHFKVTHSARYPAPILSMGISSLCRTMAVGT
SSGLLFIRQRKGQIEDKIHSDSSGLQVNPVNDEKDSAVLKPNQYRYYLRGRSEK
PSEGDYVVKRMAKVYFQEYDKDLRHFNHSKALVSALKAADSKGTVAVIEELVAR
KRLIQTLSILNLDELELLINFLSRFILVPKYSRFLISLTDRVLDARAVDLGKSE
NLKKQIADLKGIVVQELRVQQSMQELQGIIEPLIRASAR
449 WD40 repeat MDVETSGKPTGNKRTYTRLPRQVCVFWQEGRCTRESCNFLHVDEPGSVKRGGAT 948 2228
protein NGFAPKRSYNGSDERDTLAAGPPGGSRRNISARWGRGRGGIFISDERQKIRNKV
CNYWLAGNCQRGEECKYLHSFVMGSDVKFLTQLSGHVKAIRGIAFPSDSGKLYS
GGQDKKVIVWDCQTGQGTDIPLNDEVGCLMSEGPWIFVGLPNAVKAWNILTSTE
LSLVGPRGQVHALAVGNGMLFAGTHDGSILAWKFSPASNTFEPAASLVGHTQAV
VSLVSGADRLYSGSMDKTIRVWDLGTFQCLQTLRDHTSVVMSLLCWDQFLLSCS
LDNTVKVWVATSSGALEVTYTHNEEHGVLALCGMNDEQAKPVLLCSCNDNTVRL
YDLPSFSERGRIFSRNEVRTFQIAPGGLFFTGDATGELKVWNWATQKS
450 WD40 repeat MSVQELRERHAAATAKVNALRERIKAKRLQLLDTDVATYASSNGRTPISFSFTD 332 1465
protein LVCCRTLQGHTGKVYSLDWTSEKNRIVSASQDGRLIVWNALTSQKTHAIKLPCA
WVMTCAFSPSGQAVACGGLDSVCSIFQLNNQLDRDGHLPVSRILSGHRSYVSSC
QYVPDGDTHVITGSGDRTCIQWDVTTGQRIAIFGGEFPLGHTADVMSVSISAAN
PKEFVSGSCDTTTRLWDTRIASRAIRTFHGHEADVNTVKFFPDGLRFGSGSDDG
TCRLFDIRTGHQLQVYRQPPRENQSPTVTAIAFSFSGRLLFAGYSNGDCFVWDT
ILEKVVLNLGELQNTHNGRISCLGLSADGSALCTGSWDKNLKIWAFGGHRKIV
451 WD40 repeat MKVKIISRSTDEFTRERSNDLQRVFRNFDPNLHTQARAQEYVRALNAAKLDKIF 232 1590
protein AKPFLAAMSGHIDGISAMAKSPRHLKSIFSGSVDGDIRLWDIAARRTVQQFPGH
RGAVRGLTVSTEGGRLISCGDDCTVRLWDIPVAGIGESSYGSENVQKPLATYVG
KNSFRAVDYQWDSNVFATGGAQVDIWDHDRSEPTNSFAWGSDTVISVRFNPAEK
DIFATTASDRSIVLYDLRMASPLNKLIMQTRNNAIAWNPREPMNFTAANEDCNC
YSYDMRRMNISTCVHQDHVSAVMDIDYSPSGREFVTGSYDRTVRIFPYNAGHSR
EIYHTKRMQRVFCVKFSGDATYVVSGSDDANIRLWKAKASEQLGVLLPRERKRH
EYLDAVKERFKHLPEIKRIERHRHLPKPIYKAALLRHTVNAAAKRKEERKRAHS
APGSVVTNPLRKKRIVAQLE
452 WD40 repeat MDHYYQDDFDYLVDDEMVDFADDVEDDVRTRRRSDIDSDSENDFDLNNKSPDTT 207 1550
protein ALQAKRGKDIQGIPWNRLNFTREKYRETRLQQYKNYENLPRPRRSRNLDKECTN
FERGSSFYDFRHNTRSVKATIVHFQLRNLVWATSKHNVYLMQNYSIMHWSSLKQ
KGEEVLNVAGPIVPSVKHPGSSPQGLTRVQVSAMSVKDNLVVAGGFQGELICKY
LDKPGVSFCTKISHDENGITNAVEIYNDASGATRLMTANNDLAVRVFDTEKFTV
LERFSFPWSVNHTSVSPDGKLVAVLGDNADCLLADCKTGKTVGTLRGHLDYSFA
AAWHPDGYILATGNQDTTCRLWDVRKLSSSLAVLKGRMGAIRSIRFSSDGRFMA
MAEPADFVHLYDTRQNYTKSQEIDLFGEIAGISFSPDTEAFFVGVADRTYGSLL
EFNRRRMNYYLDSIL
453 WD40 repeat MAEALVLRGTMEGHTDAVTAIATPIDNSDMIVSSSRDKSILLWNLTKEPEKYGV 221 1171
protein PRRRLTGHSHFVQDVVISSDGQFALSGSWDSELRLWDLNTGLTTRRFVGHTKDV
LSVAFSIDNRQIVSASRDRTIKLWNTLGECKYTIQPDAEGHSNWISCVRFSPSA
TNPTIVSCSWDRTVKVWNLTNCKLRNTLVGHGGYVNTAAVSPDGSLCASGGKDG
VTMLWDLAEGKRLYSLDAGDIIYALCFSPNRYWLCAATQQCVKIWDLESKSIVA
DLRPDFIPNKKAQIPYCTSLSWSADGSTLFSGYTDGKIRVWGIGHV
454 WD40 repeat MAAIKSTSRSASVAFAPDAPLLAAGTMAGAIDLSFSSLANLEIFKLDFQSDDPE 221 3679
protein LPVVGECPSNERLNRLSWGSAGGSFGIIAGGLVDGTINIWNPATLINSEDNGDA
LIARLEQHTGPVRGLEFNTISTNLLASGAEDGELCIWDLANPTAPTHFPPLKGV
GSGAQGEISFLAWNRKVQHILASTSYSGTTVVWDLRRQKPIISFPDATRRRCSV
LQWNPDASTQLIVASDDDNSPTLRAWDLRNTISPYKEFVGHSRGVIAMSWCPSD
SLFLLTCAKDNRTLCWDTGSGEIVCELPAGANWNFDVQWSPKIPGILSTSSFDG
KIGIHNIEACSRNVSGEVEFGGAIVRGGPSALLKAPKWLERPAGVSFGFGGKLA
SFRPSTVAQAADHRHSEVFIHNLVTEDNLVIRSTEFEAAIADGEKVSLRALCDR
KAEESQSDEEKETWNFLRVMFEDEGTARTKLLEHLGFKVQSEENGDLQETHSSK
IDDIGSEIGKTLTLDDKTEEDVLPQLKGGQDAAIPQDNGEDFFDNLHSPKEEVS
LSHVGNDFVGEKDKDMVVNGAEIEHETEDLTEYSDWNEAIQHSLVVGDYKGAVL
QCLSANRMADALIIAHLGGNSLWEKTRDEYLKKAKSSYLKVVSAMVNNDLTGLV
NSRPLKSWKETLAMLCTYSQREEWTVLCDMLASRLIAAGNVMAATLCYICAGNI
EKTVEIWSRSLKYDYDGRSFVDHLQDVMEKTVVLALATGQKRVSPSLSKLVENY
AELLASQGLLTTAMEYLKLLGTEESSHELSILRDRLYLSGTDNKVEASSFPFET
RQDLTESQYNMHQTGFGAPETQKNYQENVHQVLPSGSYTDNYQPTANTHYIAGY
QPAPQQQPSFQNYFTPASYQPAPSPNVFYPSQVSQAEQSNFAPPVNQPPMKTFV
PSTPPILRNVDQYQTPSLNPQLYQGVSSATVETHPYQTGAPASVSVGTTPGQPS
VVPNFMVPGPVTAPTVTPRGFMPVTTPTQHPLGSANPPVQPQSPQSSQVQSVTA
ATTPPPTIQNVDTSNVAAEIRPVIGTLRRLYDETSEALGGARANPAKRREIEDN
SRKIGSLFAKLNSGDISSNAASKLVHLCQALESRDYATAFQIQVGLTTSDWDEC
SFWLAALKRMIKVKQNMR
455 WD40 repeat MAGAADSQLQTLSERDSTPNFKNLHTREYAAHKKKVHSVAWNCTGTKLASGSVD 269 1252
protein QTARVWNIEPHGHSKTKDLELKGHADSVDQLCWDPKHSELLATASGDRTVRLWD
ARSGKCSQQVELSGENINITFKPDGTHIAVGNRDDELTIIDVRKFKPLHKRKFS
YEVNEIAWNTTGELFFLTTGNGTVEVLSYPSLQVLHTLVAHTAGCYCIAIDPIG
RYFAVGSADALVSLWDLSEMLCVRTFTKLEWPVRTISFNHDGQYIASASEDLFI
DIADVQTGRTVHQISCRAAMNSVEWNPKYNLLAFAGDDKNKYMQDEGVFRVFGF
ETP
456 WD40 repeat MAATSPVGAGSGRELANPPTDGISNLRFSNHSDHLLVSSWDRKVRLYDASANSL 214 1242
protein KGQFVHGGPVLDCCFHDDASGFSGSADNTVRRYDFSTRKEDILGRHEAPVRCVE
YSYAAGQVITGSWDKTLKCWDPRGASGQEKTLVGTYSQLERVYSMSLVGHRLVV
ATAGRHINVYDLRNMSQPEQRRESSLKYQTRCVRCYPNGTGFALSSVEGRVAME
FFDLSEAGQAKKYAFKCHRKSEAGRDTVYPVNAIAFHPIYGTFATGGCDGYVNV
WDGNNKKRLYQYSKYPTSIAALSFSRDGRLLAVASSYTFEEGEKPHEPDAVFVR
SVNEAEVKPKPKVYAAPP
457 WD40 repeat MASDDEEGFKNEEAPGVVDEAEVQEGLRACFPLSFGKQEKKQAPLESIHSATKR 119 2065
protein PEDPRPRRQLGPPRPPPSILAEQEDSDRFVGPPRPPQFVRDDNDDGEAEIMIGP
PRPPAQYSDDHDNEETIGPPKPSYLEKGEETDQMVGPSKRGSDDETSGDSDDGD
DAVDFRVPLSNEIVLRGHTKVVSALAIDQTGSRVLTGSYDYSVRMYDFQGMTSQ
LKSFRQLEPAEGHQVRSLSWSPTSDRFLCVTGSAQAKIFDRDGLTLGEFVKGDM
YLRDLKNTKGHISGLTCGEWHPKEKQTILTCSEDGSLRIWDVNDFNTQKQVIKP
KLAKPGRVPVTACAWGRDGKCIAGGVGDGSIQVWNLKPGWGSRPDLYVAKGHDD
DITGLQFSADGNILLTRSTDETLKVWDLRKAITPLQVFRDLPNNYAQTNVAFSP
DERLIFTGTSVERDGNSGGLLCFYDRQTLELVLRIGVSPVHSVVRCTWHPRHNQ
VFATVGDKKEGGAHILYDPALSERGALVCVARAPRKKSLDDFEAKPVIHNPHAL
PLFRDEPSRKRQREKARMDPMKSQRPDLPVTGPGFGGRVGSTKGSLLTQYLLKE
GGLIKETWMEEDPREAILKYADVAAKDPKFIAPAYAQTQPETVFAETDSEEEQK
458 WD40 repeat MKERGQSHAGQPSVDERYTQWKSLVPVLYDWLANHNLVWPSLSCRWGPQMHQAT 186 1550
protein YKNSQRLYLSEQTDGTVPNTLVIATCEVVKPRVAAAEHISQFNEEARSPFVKKF
KTIIHPGEVNRIRELPQNSKIVATHTDGPDVLIWDVDTQPNRQATLGAADSRPD
LVLTGHKDNAEFALAMSPSAPFVLSGGKDKCVLLWSIQDHISAATEPSSAKASK
TPSSAHGEKVPKIPSIGPRGVYKGHKDTVEDVQFCPSNAQEFCSVGDDSALILW
DARNGNEPVIKVEKAHNADLHCVDWNPHDENLILTGSADNSVRMFDRRNLTSSG
VGSPVHKFEGHSAPVLCVQWCPDKASVFGSAAEDSYLNVWDYEKVGKNVGKKTP
PGLFFQHAGHRDKVVDFHWNSFDPWTIVSVSDDGESTGGGGTLQIWRMSDLIYR
PEDEVLAELERFRAHILSCQNK
459 WD40 repeat MSSLSRELVFLILQFLDEEKFKESVHKLEQESGFFFNMKYFDEKAQAGEWDEVE 244 3671
protein RYLSGFTKVDDNRYSMKIFFEIRKQKYLEALDRQDRAKAVDILVKDLKVFSTFN
EELYKEITQLLTLDNFRENEQLSKYGDTKSARTIMMSELKKLIEANPLFREKLI
YPNLKASRLRTLINQSLNWQHQLCKNPRPNPDIKTLFTDHACGPPNGARTPTQP
TASLGVLPKATTFTPIGPHGPFPSSSTATSGLASWMSNPNMVTSPQAPVAVGPS
VPVPPNQATLLKRPRTPPGSSSVVDYQTADSEQLIKRLRPVSQSIDEATYPGPT
LRVPWSTDDLPKTLARALNEPYPVTSIDFHPSQQTFLLVGTKNGEITLWEVGSR
EKLATRSFKIWDNANCSNHLEAAFVKDSSVSINRVLWSPDGTLIGIAFTKHLVH
TYTFQGLDLRQHLEIDAHVGGVNDLAFSHPNKQLCVVTCGDDKMIKVWDAVTGR
KLYNFEGHDAPVYSVCPHHKENIQFIFSTAVDGKIKAWLYDHLGSRVDYDAPGH
SCTTMMYSADGTRLFSCGTSKEGESFLVEWNESEGAIKRTYSGLRKKGSGVVQF
DTTQNHFLAVGDEHLIKFWDMDSTNMLTSCDAEGGLLNLPRLRFNKEGSLLAVT
TVNGIKILANADGQKLLKTMENRTFDLPSRAHIDAASATSSPATGRMERIERTS
SANTVSGINGVDPAQSSEKLRLSDDLSEKTKIWKLTEITDSIQCRCITLPENAA
EPASKVSRLLYTNSGVGLLALGSNAVHKLWKWNRSEQNPSGKATASVHPQRWQP
TSGLLMTNDITDINPEEAVPCIALSKNDSYVMSASGGKVSLFNMMTFKVMTTFM
PPPPASTFLAFHPQDNNIIAIGMEDSTIHIYNVRVDEVKTKLKGHQKRITGLAF
SSTQNILVSSGADAQLCVWNTETWEKRKSKTIQMPVGKTVSGDTRVQFHSDQLH
ILVVHETQLAIYDAYKLERQYQWVPQDALSAPILYATYSCNRQLIYATFSDGNI
GVYDAEILRPRCRIAPTTYLSSGTSSSTSLPLVVAAHPHEPNQFAIGLSDGAVQ
VLEPSESEGKWGVSPPPENGVVPAVVAGPSTSNQGSEQAPR
460 WD40 repeat MAKDEEEFRGEMEERLVNEEYKIWKKNTPFLYDLVITHALEWPSLTVQWLPDRE 163 1431
protein EPPGKDYSVQKMILGTHTSDNEPNYLMLAQVQLPLEDAENDARQYDDERGEIGG
FGCANGKVQVIQQINHDGEVNRARYMPQNPFIIATKTVSAEVYVFDYSKHPSKP
PQDGGCHPDLRLRGHNTEGYGLSWSPFKHGHLLSGSDDAQICLWDINVPAKNKV
LEAQQIFKVHEGVVEDVAWHLRHEYLFGSVGDDRHLLIWDLRTSATNKPLHSVV
AHQGEVNCLAFNPFNEWVLATGSADRTVKLFDLRKISSALHTFSCHKEEVFQIG
WSPKNETILASCSADRRLMVWDLSRIDEFQTPEDALDGPPELLFIHGGHTSKIS
DFSWNPCEDWVIASVAEDNILQIWQMAENIYHDEEDDMPPEEVV
461 WD40 repeat MSPGVKQTGSQKFESGHQDVVHDVTMDYYGKRIATCSADRTIKLFGLNASDTPS 155 1081
protein LLASLTGHEGPVWQVAWAHPKFGSMLASCSYDGRVIIWREGQQENEWSQVQVFK
EHEASVNSISWAPNELGLCLACGSSDGSITVFTCREDGSWDKTKIDQAHQVGVT
AVSWAPASAPGSLVGQPSDPIQKLVSGGCDNTAKVWKFYNGSWKLDCFPPLQMH
TDWVRDVAWAPNLGLPKSTIASCSQDGKVVIWTQGKEGDKWEGRILNDFKIPVW
RVNWSLTGNILAVADGNNSVTLWKEAVDGDWNQVTTVQ
462 WD40 repeat MSSGVKQTGSQKFESGHQDVVHDVTMDYYGKRIATCSADRTIKLFGMNTSDTPT 537 1463
protein LLASLTGHEGPVWQVAWAHPKFGSMLASCSYDRRVIIWREGQQENEWSQVQVFK
EHEASVNSISWAPHELGLCLACGSSDGSITVFTGREDGSWDKTKIDQAHQVGVT
AVSWAPASAPGSLVGQPSDPVQKLVSGGCDNTAKVWKFYNGSWKLDCFPPLQMH
TDWVRDVAWAPNLGLPKSTIASCSQDGRVVIWTQGKEGDKWEGKILNDFKTPVW
RISWSLTGNILAVADGNNNVTLWKEAVDGEWNQVTTVQ
463 WD40 repeat MKKRSRPSNGHLSTAAKNKSRKTAPITKDPFFDSAHNRNKSKGKGKSRGKGEEI 284 1909
protein FSSDEDDDAIGRDAPAEEEEEIAEEERETADEKRLRVAKAYLDKIRAITKANEE
DNEEEAGEDEETEAERRGKRDSLVAEILQQEQLEESGRVQRQLASRVVTPSKLV
ECRVVKRHKQSVTAVALTEDDLRGFSASKDGTIIHWDVETGASEKYEWPSQAVS
VSSSNEVSKTQKGKGSKKQGSKHVLSMAVSSDGRYLATGGLDRYIHLWDTRTQK
HIQAFRGHRGAVSCLAFRQGTQQLISGSFDRTIKLWSAEDRAYMDTLYGHQSEI
LAVDCLRKERVLSVGRDHTLRLWKVPEETQLVFRGHAASLECCCFINNEDFLSG
SDDGSIELWSMLRKKPVFMAKNAHGHAIVENLSEDTSTREEPDEEVTTRQLPNG
NSIGNGMTNQMGITPSVESWVGAVTVCRGTDLAASGAGNGVVRLWAIENSSKSL
RALHDIPLTGFVNSLTFARSGRFLIAGVGQEPRLGRWGRIQAARNGVTLCPIELS
464 WD40 repeat MAATFGTINTATSPHNPNKSFEIVQPPNDSISSLSFSPKANYLVATSWDNQVRC 610 1659
protein WEVLQTGASMPKAAMSHDQPVLCSTWKDDGTAVFSAGCDKQAKMWPLLTGGQPV
TVAMHDAPIKDIAWIPEMNLLATGSWDKTLKYWDTRQSNPVHTQQLPERCFALS
VRHPLMVVGTADRNLIIFNLQNPQTEFKRISSPLKYQTRCVAAFPDKQGFLVGS
IEGRVGVHHVEEAQQSKNFTFKCHRDSNDIYAVNSLNFHPVHQTFATAGSDGAF
NFWDKDSKQRLKAMARSNQPTPCSTFNSDGSLYAYAVSYDWSKGAENHNPATAK
HHILLHVPQESEIKGKPRVTTSGRK
465 WD40 repeat MVVMDKGTHQTNEDESESEFIDEDDVIDEISIDEEDLPDADVEGEDVQEDNKRS 241 1452
protein EPDENSSSLDDAIHTFEGHEDTLFAVACSPVDATWVASGGGDDKAFMWRIGHAT
PFFELKGHTDSVVALSFSNDGLLLASGGLDGVVRIWDASTGNLIHVLDGPGGGI
EWVRWHPKGHLVLAGSEDYSTWMWNADLGKCLSVYTGHCESVTCGDFTPDGKAI
CTGSADGSLRVWNPQTQESKLTVKGYPYHTEGLTCLSISSDSTLVVSGSTDGSV
HVVNIKNGKVVASLVGHSGSIECVRFSPSLTWVATGGMDKKLMIWELQSSSLRC
TCQHEEGVMRLSWSLSSQHIITSSLDGIVRLWDSRSGVCERVFEGHNDSIQDMV
VTVDQRFILTGSDDTTAKVFEIGAF
466 WD40 repeat MPVFRTAFNGYAVKFSPFVETRLAVATAQNFGIIGNGRQHVLELTPNGIVEVCA 223 1173
protein FDSSDGLYDCTWSEANENLVVSASGDGSVKIWDIALPPVANPIRSLEEHAREVY
SVDWNLVRKDCFLSASWDDTIRLWTIDRPQSMRLFKEHTYCIYAAVWNPRHADV
FASASGDCTVRIWDVREPNATIIIPAHEHEILSCDWNKYNDCMLVTGSVDKLIK
VWDIRTYRTPMTVLEGHTYAIRRVKFSPHQESLIASCSYDMTTCMWDYRAPEDA
LLARYDHHTEFAVGIDISVLVEGLLASTGWDETVYVWQHGMDPRAC
467 WD40 repeat MDSRNRRSRLNLPPGMSPSSLHLETTAGSPGLSRVNSSPSTPSPSRTTTYSDRF 251 1777
protein IPSRTGSRLNGFALIDKQPQPLPSPTRSAAEGRDDASSSSASAYSTLLRNELFG
EDVVGPATPATPEKSTGLYGGSRDSIKSPMSPSRNLFRFKNDHGGNSPGSPYSA
STVGSEGLFSSNVGTPPKPARKITRSPYKVLDAPALQDDFYLNLVDWSSNNVLA
VGLGTCVYLWSACTSKVTKLCDLGVNDSVCSVGWTPQGTHLAVGTNIGEVQIWD
TSRCKKVRTMGGHCTRAGALAWSSYILSSGSRDRNILHRDIRVQDDFIRKLVGH
KSEVCGLKWSYDDRELASGGNDNQLLVWNQQSAQPLLRFNEHTAAVKAIAWSPH
QHGILASGGGTADRCLRFWNTATDTRLNCVDTGSQVCNLVWCKNVNELVSTHGY
SQNQIMVWRYPSMSKLATLTGHTLRVLYLAISPDGQTIVTGAGDETLRFWSIFP
SPKSQSAVHDSGLWSLGRTHIR
468 WD40 repeat MEKKKVVVPIVCHGHSRPIVDLFYSPVTPDGLFLISASKDSSTMLRNGETGDWI 367 1419
protein GTFEGHKGAVWSCCLDNRALRAASGSADFSAKIWDALTGDELHCFVHKHIVRAC
AFSESTSLLLTGGHEKILRIFDLNRPDAPPKEVDNSPGSIRTVAWLHSDQTILS
SNSDAGGVRLWDLRTEKIVRVLETKSPVTSAEVSQDGRYITTADGNSVKFWDAN
HFGMVKSYTMPCMVESASLEPTMGNMFVAGGEDMWVRLFDFHTGEEIACNKGHH
GPVHCVRFAPGGESYSSGSEDGTIRIWQTLNMNSEENESYGVNGLSGKVRVGVD
DVVQKVEGFQITADGHLNDKPEKPNP
469 WD40 repeat MERYSQGTQKKSEIYTYEAPWQIYGMNWSVRKDKKFRLGIGSFLEEYNNRVEII 284 1303
protein ELDEESGEFKSDPRLAFDHPYPTTKIMFVPDKECQRPDLLATTGDYLRIWQVCE
DRVEPKSLLNNNKNSEFCAPLTSFDWNDADPKRIGTSSIDTTCTIWDIEKEVVD
TQLIAHDKEVYDIAWGEVGVFASVSADGSVRVFDLRDKEHSTIIYESSQPETPL
LRLGWNKQDPRFIATILMDSCKVVILDIRFPTLPVAELQRHQASVNTIAWAPHS
PCHICTAGDDSQALIWELSSVSQPLVEGGGLDPILAYTAAAEINQLQWSSMQPD
WVAIAFSNEVQILRV
470 WD40 repeat MQSENNLDESLHLREVQELQGHTDTVWAVAWNPVTGIDGAPSMLASCSGDKTVR 684 1784
protein IWENTHTLNSTSPSWACKAVLEETHTRTVRSCAWSPNGKLLATASFDATTAIWE
NVGGEFECIASLEGHENEVKSVSWSASGMLLATCGRDKSVWIWDVQPGNEFECV
SVLQGHTQDVKMVQWHPNRDILVSASYDNSIKVWAEDGDGDDWACMQTLGNSVS
GHTSTVWAVSFNSSGDRMVSCSDDLTLMVWDTSINPAERSGNAGPWKHLCTISG
YHDRTIFSVHWSRSGLIASGASDDCIRLFSESTDDSVTPVDGTSYKLILKKEKA
HSMDVNSVQWHPSEPQLLASASDDGRIKIWEVTRINGLANSH
471 WD40 repeat MKRAYKLQEFVAHASNVNCLKIGKKSSRVLVTGGEDHKVNMWAIGKPNAILSLS 336 2738
protein GHSSAVESVTFDSAEALVVAGAASGTIKLWDLEEAKIVRTLTGHRSNCISVDFH
PFGEFFASGSLDTNLKIWDIRRKGCIHTYKGHTRGVNSIRFSPDGRWVVSGGED
NIVKLWDLTAGKLMHDFKCHEGQIQCMDFHPQEFLLATGSADRTVKFWDLETFE
LIGSAGPETTGVRAMIFNPDGRTLLTGLHESLKVFSWEPLRCYDAVDVGWSKLA
DLNIHEGKLLGCSYNQSCVGVWVVDISRVGPYAAGNVSRTNGHNEAKLASSGHP
SVQQLDNNLKTNMARLSLSHSTESGIKEPKTTTSLTTTEGLSSTPQRAGIAFSS
KNLPASSGPPSYVSTPKKNSTSRVQPTTNFQTLSRPDIVPVIVPRSNSLRPETT
SDVKKEMNNFGRVVPSTVSTKSTDVIKSGSNRDESDKIDSINQKRMTGNDKTDL
NIARAEQHVSSRLDNTNTSSVVCDGNQPAARWIGAAKFRRNSPVDPVVSPHDRS
PTFPWSATDDGVTCQPDRQVTAPELSKRVVEPGRARALVASWETREKALTADTP
VLVSGRPPTSPGVDMNSFIPRGSHGTSESDLTVSDDNSAIEELMQQHNAFTSIL
QARLTKLQVIRRFWQRNDLKGAIDATGKMGDHSVSADVISVLIERSEIFTLDIC
TVILPLLTRLLQSETDRHLTVAMETLLVLVKTFGDVIRATISATPTIGVDLQAE
QRLERCNLCYVELENIKQILVPLIRRGGAVAKSAQELSLALQEV
472 WD40 repeat MSTLEIEARDVIKIVLQFCKENSLHQTFQTLQNECQVSLNTVDSLETFVADINS 81 1622
protein GRWDVILPQVAQLKLPRKKLEDLYEQIVLEMIELRELDTARAILRQTQAMGFMK
QEQPERYLRLEHLLVRTYFDPREAYHESSKEKRRSQIAQALASEVTVVPPSRLM
ALIGQSLKWQQHQGLLPPGTQFDLFRGTAAVKADEEEMYPTTLAHTIKFGKQSH
PECARFSPDGQYLVSCSVDGFIEVWDYISGKLKKDLQYQADDSFMMHDDAVLCV
DFSRDSEMLASGSQDGKIKVWRIRTGQCLRRLERAHSQGVTSLSFSRDGSQLLS
TSFDSTARIHGLKSGKALKEFRGHTSYVNDAIFTSDGGRVITASSDCTVKVWDV
KTTDCIQTFKPPPPLKGGDVSVNSVHLFPKNSEHIVVCNKASSIYIMTLQGQVV
KSFSSGKREGGDFVAACISPKGEWIYCVGEDRNIYCFSQQSGKLEHLMKAHDKD
IIGVTPHPHRNLLVTYSEDSTMKIWKP
473 WD40 repeat MDIELEDQPFDLDFHPSAPIVAVALITGRLQLFRYVDISSEPERLWTVTAHTES 399 1460
protein CRAARFINAGSSVLTASPDCSILATNVETGQPVARLDNAHGAAINCLTNLTEST
IASGDENGIIKVWDTRQNSCCNKFKAHEDYISDMEFVPDTMQLLGTSGDGTLSV
CNLRKNKVHARSEFSEDELLSVALMKNGKKVVCGSQEGVLLLYSWGYFKDCSDR
FVGHPHSVDALLKLDEDTVLTGSSDGIIRVVSILPNKMIGVIGEHSSYPIERLA
FSHDRNVLGSASHDQILKLWDIHYLHEDDEPETNKQEAVNDENVDMDLDVDTEK
RPRGSKRKKRAEKGQTSSQKQSSDFFADI
474 WD40 repeat MDRIQQIPHTCVARKINLPLGMSKESLALNLPANLAPTMSPPSITYSDRFIPSR 207 1673
protein KASNFEEFALPDKTSPSPNSAGGQSSSTNGEGRDDACAAYSALLRTELFPATPD
KTEGCRRPVIGSPSGNVFRFKSQQCKSQSPFSLCPVGEDGDLSETGAVARKTTR
KIPRSPFKVLDAPALQDDFYLNLVDWSSHNILAVGLSACVYLWSASSSKVTKLC
DLGLDDNVCSVAWTQRGTYLAVGTNNGGVQIWDAAHCKQVRTMEGHCTRVGTLA
WNSHILSSGGRDRNILQRDIRAQDDFVSKFSGHKSEVCGLKWSYDNRELASGGN
DNQLFVWNQQSQQPVLKYNEHTAAVKAIAWSPHQHGLLASGGGTADRCIRFWNT
ATNTSLNCVDTGSQVCNLVWSKNVNELVSTHGYSQNQIIVWRYPTMSKLATLTG
HTLRVLYLAISPDGQTIVTGAGDETLRFWNVFPSSKTQQNTIRDMGVWSSGRTH
IR
475 WD40 repeat MAGGQGEGEEKVDKLSMELTEDVMKSMEIGAVFKDYNGKINSLDFHRTNNYLVT 263 1309
protein ASDDEAIRLFDTASATWQKTSYSKKYGVDLICFTNHQTSVLYSSKNGWDESLRH
LSLMDNKYLRYFKGHHDRVVSLCMSPKGECFMSGSLDRTVLLWDLRIDKCQGLI
RVRGRPAVAYDEQGLVFAISNEGGLIKMFDARLYDKGPFDTFVVEGDKSEASGI
KFSNDGKLILLSTMDSNIHVLDAYQGTTVHSFSVEAVPNGGEAVPNGGTLEASF
SPDGKFVISGSGNGNIHAWSVNSGKEVACWTTEGVIPAVVKWAPRRLMFASGSS
VLSLWVPDLSKLASLTGSNSNSAY
476 WD40 repeat MHRVGSTGNTSNSSRPRREKRLTYVLNDANDSRHCSGINCLVISKLSLLGGNDY 232 2529
protein LFSGSRDGTLKRWELADDSAVCSATFESHVDWVNDAVLTGETLVSCSSDTTLKT
WRPFSDGVCTRTLRQHSDYVTCLAAASKNSNIVASGGLGREVFIWDIEAAMAPV
SRTSEAMDDDTSNGVLSSGNSVLSTTVRSTNATNSASLHTSQLQGYTPIAAKGH
KESVYALAMNDVGTLLVSGGTEKVVRVWDPRSGAKQMKLRGHTDNVRALILDST
GRFCLSGSSDSIIRLWDLGQQRCVHSYAVHTDSVWALASTPNFSHVYSGGRDLS
LYLTDLTTRESLLLCMEKHPLLRLTLQDDSIWVATTDSSLHRWPAEGQNPPKMF
QRGGSFLAGNLSFTRARACLEGSAPVPVNTQPSFVIPGSPGIVQHEILNNRRHV
LTKDAEGTVKLWEITRGAVLDDYGKVSFEEKKEELFEMVSIPAWFTMDTRLGSM
SVHLDTPQCFTAEMYAVDLNVPDAPEEQKINLAQETLRGLLAHWLSRRRQRLAT
QASANGDFPAGQENALRNHISSRIDVHDDAETHIAGILPAFDFSTTSPPSIITE
GSQGGPWRKKITDLDGTEDEKDFPWWCLECVLHGRLSPRESLKCSFYLHPYEGT
TVQVLTQGKLSAPRILRIQKVINYVLEKMVLDRPLDSSNSETTFTPGLSGNQSH
AAVVGDGSLRSGARVWQQKAKPLVEILCNNQVLSPDMSLATVRTYIWKKPDDLY
LYYRLVQNR
477 WD40 repeat MMKGKTIQMQAAHQNHDGETSVACVLWDWHAKHLITAGADNTILIHSYPSSSSS 56 2950
protein KPITLRHHKNAVTALAINSNVRSLASGSVDHSVKLYSYPGGEFQSNVTRFTLPI
RSLAFNKSGELLAAAGDDEGIKLISTIDNSIARVLKGHNGPVTSISFDPKNEFL
ASSDSDGTVIYWELSTGKPVHTLKKIAPNTTSNPTSLNQISWRPDGEMLAVPGR
KSEVSMYDRDTAEKLFSLKGGHSDTICSLAWSPNGKYIATAGTDRQVMVWDADR
RQDIDKQRFDNPICSVAWKPSDNALAVIDVLGRFGVWESPIASHMKSPADGAER
YDNMEDEEPLMARYEEELEDSVSGSLNEIINDDDDDDEMGKIPRKILQKKPSVK
VEKGKEESNAKAFKSGQDSFKLKSAMQEAFQPGATQRQSGKRNFLAYNMLGSVI
TFDNDGFSHIEVDFHDIGKGCRVPSMTDYFGFTMASLSESGSVFGSPQKGEKNP
STLMYRPFSSWANNSEWSMRFPMGEEVKAVALGSGWVAAVTSLNFLRVFSEGGL
QKFVLSMDGPVVTAAGYENLLVVVSHASNPLLSGDQVLSFTVYDISQKTCPLSG
RLPLSPGSHLTWLGFSEEGLLSSYDSEGNLRVFTNDYNGCWVPIFSAARERKSE
TESIWMVGLNSTQVFCVVCKLPDTYPQVAPKPVLSVLNLSLPLACSDLGADDLE
NEYLRGSLLLSQMQKKAEDAVACGRESNMEEDSIFKMEAALDRCLLRLIANCCK
GDKLVRATELARLLSLEKSLQGAIKLVSAMKLPMLAERFNTILEEKILQENMET
ISCRRLTSEAQDMDTPISISVKQVSYGANLGDSPFLPNRQVEPKHSTPVFSKPD
TKIEVDTSEAIAKGCDAQNGNIKSGDAEVQPASHNDSIQKPSNPFAKASNTSAN
QAVQRNASLLSSIKQMKTATENEGKRKERARSGSLPQKPAKQSKIS
478 WD40 repeat MKQKRKGHQVDDPKYSVQTPQEDDTPNESGPASEEVESSDEEGGNSSNIEDDII 193 2577
protein YSSSEEDPVVSSDYEEDEDAESDAEGVTAEQELEGDIDNALQNYMGTLTVLSNF
HGENLKNAEGEDTSGDDDDEEEMPKRAEESDSPEDENDERPKRAEESDFSEDED
EERPKRAEESDSSEDEVPSRNTVGDVPLRWYKDEQHIGYDIKGKKIKKQPKKDQ
LDSFLASTDDSSDWRKVYDEYNDEEVELTKDEIKFISRLRKGTIPHADVNPYEP
YVDWFDWKDKGHPLSNAPEPKRRFIPSKWEAKKVVKLVRAIRKGWITFQKAEEK
PRFYLMWGDDLKPSEKMANGLSYIPAPKPKLPGHEESYNPPPEYIPTQEEINSY
QLMYEEDRPKFIPKRFDSLRNVPAYDRFLSEIFERCLDLYLCPRTRKKRINIDP
ESLIPKLPKPKDLQPFPSICFLEYKGHTGAVSCISPESSGQWLASGSKDGTVRI
WEVETARCLKVWDIGRPIQHIAWNPVSQLSILAVAVDEEVLVLNTGLGSEDSQE
KVAELLHVKSKPVSADDLGDNTSLTKWIKHEKFDGIKLTHLKPVHLISWHHKGD
YFATVAPDGNTRAVLVHQLSKQQTQNPFKKMQGRVVHVLFHPSRAIFFVATKTH
VRVYDLVKQQLVKRLVTGLHEVSSMAVHHKGDNLLVGSKEGKVCWFDMDLSTQP
YKTLKNHSKDIHSVAFHDSYPLFASCSDDCKAYVFYGLVYSDLLQNPLIVPLKV
LQGHQSVNGMGVLDCQFHPKQPWLFTAGADSVVKLYCN
479 WD40 repeat MMSLKRGFEESLVPAKRQKTELSTVTYGDGPRRTSSLESPIMLLTGHHAAIYTM 187 1233
protein KFNPTGTVIASGSHEREIFLWNVHGDCKNFMVLKGHKNAVLDLHWTTDGCQIIS
ASPDKTLRAWDVETGKQIKKMAEHSSFVNSCCPSRRGPPLVVSGSDDGTAKLWD
LRHRGAIQTFPDKYQITAVGFSDAADKIYSGGIDNEIKVWDLRRGEVTMRLQGH
TDTITGMQLSSDGSYLLTNSMDCSLRIWDMRPYAPQNRCVKILTGHQHNFEKNL
LKCSWSSDGSKVTAGSADRMVYIWDTTTRRILYKLPGHTGSVNETGFHPTQPII
GSCSSDKQIYLGEIEPNVGYQAVI
480 WD40 repeat MEFSDTYKHTGPCCFSPDARYLAIAVDYRLVIRDVVTLKVVQLYSCMDKISNIE 51 1436
protein WALDSEYILCGLYKRAMVQAWSLSQPEWTCKIDEGPAGIAHARWSPDSRHIITT
SDFQLRLTVWSLVNTACIHIQWPKHASKGVSFTQDGKFAAIATRRDCKDYVNLL
SCHTWEVMGTFTVDTIDLADLEWSPNDSAIVVWDSPLEYKVLIYSPDGRCLFKY
QAYDSWLGVKTVAWSPCSQFLAVGSYDQTLRTLNHLTWKPFAEFVHVSTVRGPA
SAVVFKEVEEPWNLDVSGLHLNDDNAHDIQDGKPAEGHSRVRYKVVEFPVNVSS
QKHPVDKPNPKQGIGLLAWSRDSQYLFTRNDNMPTALWIWDICRLELAALLIQK
EPIRAAAWDPVYPRVALCTGSSHLYMWTPSGACCVNIPLPQFVVSDLKWNPDGT
SMLLKDRESFCCTFVPMLPEFNDDETNEE
481 WD40 repeat MAKLIETHSCVPSTERGRGILIAGDAKTNSIIYCNGRSVIMRNLDNPLEASVYG 525 2351
protein EHSYPATVARFSPNGEWVASGDTSGTVRIWGRGSDHTLKYEYKALAGRIDDLEW
SADGQRIVVCGDSKGKSMVRAFMWDSGTNVGEFDGHSRRVLSCSFKPTRPFRVA
TCGEDFLVNFYEGPPFRFKTSHRDHSNYVNCVRFAPDGSKFITVGSDRKGVIFD
GKMGEKIGELSKEGGHTGSIYAASWSPDSKQVLTVSADKSAKIWEISETGNGTV
KKTLTFGSQGGADDMLVGCLWLNDYLITVSLGGIVSLLSAVDPDKPPKTISGHM
KSINAIALSLQSGQSEVCSSSYDGVIVRWILGVGYAGRVERKDSTQIKCLATIE
GELVTCGFDNKVRRVPLLSEQHKESEPIDIGAQPKDLDVAVGCPELTFVSTDAG
IIIIRASKIVSTTNVGYAVTAAAISPDGTEAVVGGQDGKLRVYSIKGDTLLEES
VLERHRGPINAIRFSPDGSMFASGDLNREAVVWDRITREVKLKNMVYHTARINC
IAWSPDSSKVATGSLDTCILIYEVGKPASSRITIKGAHLGGVYGLAFSDQSTVI
SAGEDACVRVWSLP
482 WD40 repeat MPQPSVILATAGYDHTVRFWEATSGRCYRTLQYPDSQVNHLEITPDKQYLAAAG 152 1099
protein NPHIRLFEVNSNNPQPVISYDSHTNNVTAVGFQCDGKWMYSGSEDGTVKIWDLR
APGFQREYESRAAVNTVVLHPNQTELISGDQNGNIRVWDLNANSCSCELVPEDT
AVRSLTVMWDGSLVVAANNHGTCYVWRLMRGTQTMTNFEPLHKLQAHNSYILKC
LLSPEFCEHHRYLATTSSDQTVKIWNVDGFTLERTLTGHQRWVWDCVFSVDGAF
LVTASSDSTARLWDLSTGEAIRTYQGHHKATVCCALHDGTDGASC
483 WD40 repeat MLTKFETKSNRVKGLSFHPKRPWILASLHSGVIQLWDYRMGTLIDKFDEHDGPV 470 4114
protein RGVHFHKTQPLFVSGGDDYKIKVWNYKMRQCLFTFVGHLDYIRTVHFHNEYPWI
VSASDDQTIRLWNWQSRVCISVLTGHNHYVMSASFHPKEDLVVSASLDQTVRVW
DISGLRKKTVSPADDLSRLAQMNTDLFGGGDVVVKYVLEGHDRGVNWAAFGTSL
PLIVSGADRQVGKLWRMNDTKAWEVDTLRGHTNNVSCVIFHARQDIIVSNSEDK
SIRVWDMSKRTSVQTFRREHDRFWILAAHPEMNLLAAGHDSGMIVFKLERERPA
YVVYGGSLLYVKDRYLRTYEFATQKDNPLIPIRKPGSIGPNQGPRSLSYSPTEN
AILICSDADGGAYELYAVPKDSHGRSDTVQEAKKGLGGSAVGVARNRFAVLDKN
HNQVTIKNLKNEVTKKFDLPVTADALFYAGTGNLLCRSEDSVFLFDMQQRTVLG
EIQTPNVRYVVWSNDMENVALLSKHTIIIASKKLSSTCSLHETIRVKSGAWDDN
GIFMYSTLNHIKYCLPNGDSGIIKTLDVPVYITKVSGKSLYCLDRDGKNRVIQI
DITECLFKLALSKKKYDYVINMIRNSQLCGQAIIAYLQQKGFPEVALHFVRDER
TRFNLAVESGNIEIAVASAKEIDEKDHWYRLGVEALRQGNAGIVEYAYQRTKNF
ERLSFLYLITGNLDKLSKMLRIAEMKNDVMGQFHNALYLGDIQERIKILEESGH
LHLAYATASLHGLADIADRLAADLGGNIPVLPPGKKSSLLMPPAPILHGGDWPL
LRVTKGIFEGGLENSTSAAYEEEDEEAAADWGEDIDIENIEGENGEATVLDDQE
VKGGEDDEGGWDMEDLELPPDVAAANVGTNQKTLFVAPTLGMPVSQIWMQKSSL
AGEHAAAGNFETALRLLTRQLGIKNFSPLKPLFLELYMGSHTFLPSFASVPAFS
LALQRGWSESASPNIRGPPALVYRLSVLEEKLTVAYRATTEGRFSEALRLFLNI
LHTIPVIVVDSRKEIDEVKELIGIAKEYVLGLRMEVKRKEIRDDAVRQQELAAY
FTHCNLQKAHLKLALLNAMGISYRCKNYNTAANFARRLLETDPSSNHATKARQV
LQVCERNLQDATQLNYDFRNPFVVCGATFTPIYRGQKEVSCPYCMARFVPDIAG
KLCSICDLAIVGSDASGLFCFATQTR
484 WD40 repeat MDLLQNYQDDSEDSNPELRNHPPLEDATATSAPAGVENETSSSPDSSPLRLALP 196 2007
protein AKSCAPDVDETLMALGVPGSEKKNNHNKPIDPTQHSVTFNPSYDQLWAPLYGPA
HPYAKDGIAQGMRNHKLGFVEDSAIEPFMFDEQYNTFHRYGYAADPSASLGSHI
VGDLESLKKNDGASVYNLPKREHKRQKLEKKMIQKDENEEEEKEVGEEVDNPST
EEWLKKNRKSPWAGKKEGLQTELTEEQKKYAQEHAEKKGDREKGEKVEIVDKTT
FHGKEERDYQGRSWIDPPKDAKATNDHCYIPKRWVHTWSGHTKGVSAIRFFPKY
GHLLLSAGMDTKVKIWDVFNSGKCMRTYMGHSKAVRDISFSNDGSRFLSAGYDR
NIKLWDTETGKVISTFSTGKIPYVVKLHPDEDKQNVLLAGMSDKKIVQWDMNSG
EITQEYDQHLGAVNTITFVDNNRRFVTSSDDKSLRVWEFGIPVVIKYISEPHMH
SMPSISLHPNTNWLAAQSLDNQILIYSTRERFQLNKKKRFAGHIAAGYACQVNF
SPDGRFVMSGDGEGRCWFWDWKTCKVFRTLKCHDNVCIGCEWHPLEQSKVATCG
WDGMIKYWD
485 WD40 repeat MARKGLGTDPAIGSLMSSKKRKEYKVTNRFQEGKRPLYAIAFNFIDARYHNIFA 214 1323
protein TAGGTRVTIYQCLEGGAISVLQAYVDDDKDESFYTLSWACDVNGSPLLVAGGHN
GIIRVLDVANEKVHKSFVGHGDSVNEIRTQALKPSLILSASKDESVRLWNVQTG
ICILIFAGAGGHRNEVLSVDFHPSDVYRIASCGMDNTVKIWSMKEFWTYVEKSF
TWTDLPSKFPTKYVQFPVFIAAVHSNYVDCTRWLGNFILSKSVDNEVVLWEPYS
KEQSTSDGVVDILQKYPVPECDIWFIKFSCDFHYNSMAVGNREGKVYVWELQSS
PPNLIARLSHAHCKNPIRQTAISHDGSTILCCCDDGSMWRWDVVQ
486 WD40 repeat MESGAGGSVGARVPSAKPEMLQQPPYSNGDDDNDMERGTAPVPSSNPNTVSKWE 68 2146
protein LDKDFLCPICMQTMKDAFLTACGHSFCYMCIMTHLNNKSNCPCCSLYLTNNQLF
PNFLLNKLLKKTSACQMASTASPVENLCLSLQQGAEVSVKELDFLLTLLAEKKR
KMEQEEAETNMEILLDFLQRLRQQKQAELNEVQADLHYIKDDILALEKRRLELS
RARERYSRKLHMLLDDPMDTTLGHAAIDDGNNVRTAFVRGGQGDAISGKFQQKK
AEIKAQASSQGMQKRANFCHSDSQVLPTLSGLTIARKRRVLAQFDDLQECYLQK
RRRWATQLRKQCDGGLRKERDGNSISREGYHAGLEEFQSILTTFTRYSRLRVIS
ELRHGDLFHSANIVSSIEFDRDDELFATAGVSRRIKVFDFATVVNEPADVHCPV
VEMSTRSKLSCLSWNKCIKSQIASSDYEGIVTVWDVNTRQSVMMYEEHEKRAWS
VDFSRTEPTRLISGSDDGKVKVWCTRQETSVLNIDMKANICCVKYNPGSSYYVA
VGSADHHIHYYDLRNPSVPLYEFNGHRKTVSYVKFISTNELASASTDSTLRLWD
VRDNCLVRTFKGHTNEKNFVGLTVNSEYIACGSETNGVFVYHKAISKPAAWHQF
GSPDLDDSDDDTSHFISAVCWKSESPTMLAANSQGTIKVLVLAP
487 WD40 repeat MANYVDSKKNFKCVPALQQFYTGGPFRLSSDGSFLVCACNDEVKVVDLATGSVK 874 3705
protein NTLEGDSELIVALALTPDNKYLFSASRSTQIKRWDLSSATCKRTWKAHNGPVAD
MACDASGGLLATAGADRSILVWDVDGGYCTHSFRGHQGVVTTVIFHPDPHCLLL
FSGSDDATVRIWDLVAKKCISVLEKHFSTVTSLAISENGWNLLSAGRDKVVNIW
DLRDYHCRATIPTYEPLEAVCVLPTGSRLVSVMNQSRALPENRKKSGAAPVYFL
TVGERGIVRIWYSEGALCLYEQKSSDAIISSDKDELKGGFVSAVLLPLTQGVMC
VTADQRFLFYNLDESDEGKCDLKVSKRLIGYNEEIVDLKFLGDEEKFLAVATNL
EQVRMYDLSSMTCVYELSGHTDIVLCLDTVVFSGHSLLASGSKDHTVRIWDTES
KSCICVAAGHMGAVGAVAFSKKAKNFFVSGSSDRTIKVWSFASVLDFGGISKSI
KLSSQAAVAAHDKDINSVAVAPNDSLICTGSQDRTARIWRLPDLVPVLVLRGHK
RGVWCVEFSPVDQCVMTASGDKTIKIWALSDGSCLKTFEGHTASVLRASFLTRG
TQFVSSGADGLLKLWTIKSNECIATFDQHEDKIWAMAVGKKTEMLATGGSDSLV
NLWHDCTTTDEEEALLKEEEAALKDQELLNALADTDYVKAIQLAFELRRPYKLL
NVFTELYSKGHAQDQIQKVIRELGNEELRLLLEYVREWNTKPKFAHVAQFVLFQ
LFNVLPPKEIIEVQGISELLEGLIPYAQRHYSRIDRLMRSTFLLDYTLSSMSVL
SPTETDLSSSNLLARTADPLHAQIDQFHPTHFPEPNLTPIQSLLDSGNTDSVEV
TARRAKKKRVSGNDSEKTTVAEVKIGDMENAFDEPDVADQGSSRKHKPASSKKR
KSIAVGNASIKRIASGNAVTIALQV
488 WD40 repeat MESSCSSMNSNRHSTEKRCLRPLQKQGASMNKHSSDRFIPARGSIDLDVARFMV 360 1754
protein TQKQKDNNDIHALSPSPSPSKKAYQKEMADTLLKNAGAADNNCRILSFNGKSST
VSQGSQENVLANLSISRRARRYIPQSADRTLDAPDLLDDYYLNLLDWSSTNVLS
TALGNTVYLWDASNSSISELLIADEEEGPVTSVSWAPDGSQIAVGLNNSVVQLW
DSQSNKKLRALKGHHDRVGALSWNGPILTTGGLDGIIINHDVRTRDHIVQTYKG
HTQEVCGLKWSPSGQQLASGGNDNLLYIWDKSMASHNPSSQYFHQLDEHCAAVK
ALAWCPFQTNLLASGGGTSDGSIKFWNTQTGACLNTVDTHSQVCSLLWNRHERE
LLSSHGLNQNQLTLWKYPSMVKITELTGHTARVLHMAQSPDGYTVASAAADETL
KFWQVFGAPDASKKTKTKDTKGAFNMFHMHIR
489 WD40 repeat MLDEIVADEEEEFNIWKKNTPLLYDVVITHALEWPSLTVQWLPDRHQSPTKDYS 185 1384
protein LQKMIVGTHTSGDEPNYLMIAEVQMPLQYSEDGNVGGFESTEAKVHIIQQINHE
GEVNRAQYMPQNSFIIATKTVSSDVYVFDYTKHSSNAPQERVCNPELILKGHTN
EGYSLSWSPLKEGQLLSGSNDAQICFWDINAASGRKVVEAKQIFKVHEGAVEDV
SWHLKHEYLFGSVGDDCHLLIWDTRTAAPNKPQHSVVAHESEVNSLAFNPFNEW
LLATGSADKTVKLFKLRKLSCSLFTFSNHTEEVFQIEWSPMNETILASSGGDRR
LMVWDLRRIGDEQTSEDAEDGPPELIFIHGGHTSKISDFSWNLHDDWLIASSAE
DNILQIWQMAENIYHDDADIL
490 WD40 repeat MTKEDHGESRDEMGERMVNEEYKLWKKNTPFLYDLVITHALEWPSLTVQWLPPS 241 1533
protein CKQQQDIIKDDDIDHPNTQMVILGTHTSDNEPNYLILAEVQLHDGTEDEDGDGD
VKRPQDKMKPGTSGGAMGKVRILQQINHQKEVNRARYMPQKPTIIATKTVNADV
YVFDYSKHPSKPPQEGRCNPELRLQGHESEGYGLSWSPLKEGHLLSASDDAQIC
LWDITAATKAPKVVEANQIFRYHDGPVEDVAWHAIHDHLFGSVGDDHHLLLWDI
RNDSEKPLHIVEAHQAEVNCLAFNPFNEWIVATGSADRTVALHDIRKLDKVLHT
CAHHMEEVFQIGWSPQNGAILASCGSDRRLMVWDLSRIGDEQNPEDAEEAPPEL
LFIHGGHTSKISDFSWNPAEEWVIASVAEDNILQVWQMSEHIYNDDNDSPTA
491 WD40 repeat MAMAMGDENAADPVEEFNIWKKNTPFLYDLVITHALEWPSLTVQWLPDRHQSST 230 1435
protein ADYSLQKMIVGTHTSEDEPNYLMIAEVQIPLQNSSEDNIIGGFESTEAKVQIIQK
INHEGEVNKARYMPQNSFVIATKTVSSDVYVFDYSKHPSKAPQERVCNPELILK
GHSNEGYGLSWSPLKEGYLLSGSNDAQICLWDINAAFGKKVLEANQIFKVHEGA
VGDVSWHLKHEYLFGSVGDDCHLLIWDMRTAAPNKPQQSVIAHQSEVNSLAFNP
FNEWLLATGSMDKTVKLFDLRKLSCSLHTFSNHSTDQVFQIEWSPMNETILASSG
ADRRLMVWDLARIGETPEDEEDGPPELLFVHGGHTSKISDFSWNLNDDRVIASV
AEDNILQIWQMAENIYHDDEDML
492 WD40 repeat MGLFEPFRALGYITDGVPFAVQRRGIETFVTLSVGKAWQIYNCAKLIPVLVGPQ 101 2857
protein MDKKIRALACWRDFTFAATGHDIAVFRRAHQVATWSGHKAKVTLLLSFGQHVLS
VDLEGCLFIWAVAEVNQNKPPIGQIQLGEKFSPSCIMHPDTYLNKVLIGSEEGT
LQLWNVNTRKKLYEFKGWGSSIRCCVSSPALDVVGIGCSDGKIHVHNLRYDEEI
VTFMHSTRGAVTALSFRTDGQPLLAAGGSSGVISIWNLEKKKLQSVIKDAHDSS
VCSLHFFANEPVLMSSATDNSIKMWIFDTTDGEARLLKYRSGHSAPPMCIRYYG
KGRHILSAGQDRAFRIFSVIQDQQSRELSQGHVGKRAKKLKVKDEEIKLPPVIA
FDAAEIRERDWCNVVTCHLDDPCAYTWRLQNFVIGEHILKPCLEDPTPVKSCSI
SACGNFAVLGTEGGWLERFNLQSGISRGTYIDIGEKRQCAHNGAVVGLACDATN
TLLISGGYNGDIKVWDFKGRELKFRWEIEVPLIKIVYHPGNGILATAADDMILR
LFDVTAMRLVRIFVGHMDRVTDLCFSGDGKWLLSSSMDGTIRVWDIISSRQLNA
MHMDSAVTALSLSPGMDMLATTHVGHNGIYLWANRMIYSKATDIEPFISGKQVV
KVSMPTVSSKRESEEGDEKRTIVAESNVNKSDVSGSLIGDSYSAQLTPELVTLA
LLPKAQWQSLVNLDIIKMRNKPIEPPKKPEKAPFFLPSLPTLSGERIFIPSSMN
GDGDQDETRNDKTVFEARGKKLGGESLSFMQLLQSCAKIKDFTTFTNYLKGLSP
SAVDMELRLLQIVDNENISETEHSVELQGIGMLLDYFVNEVSCNNNFEFVQALI
RLFLKIHGETIRCQVSLQEKARKLLEIQSSTWERLDTSFQNARCMITFLSSSQF
493 WD40 repeat MIAAVCWVPKGVAKVLPDSAEPPTQEEIQELLKCNVVAESDDNEDSDEESEEMD 43 1548
protein TETDKNTDAVAKALAAANALGSQSSDFQRQHKVDDIANGLKELDMDHYDDEDEG
IDIFGSGSLGNCYYPANDMDPYLVEQDDDDEDEIEDMTIKPSDLIILSARNEDD
VSHLEVWIYEEETEEGGSNMYVHHDIILPAFPLSLAWLDCNLKGGEKGNFVAVG
TMQPEIELWDLDVLDEVEPAVVLGGAVKDEASGKTTKLKKKKKNKQAVNFKEGS
HTDAVLGLAWNMEYRNVLASASADKSVKIWDIVAEKCEHTMQPHTDKVQAVAWN
PNQATVLLSGSFDRSVIMMDMRAPTHSGIRWPVPADVESLAWDPHTDHSFMVSA
EDGTVRGFDIRAAASTADFDGKPMFILHAHDKAVCAISYNPAAPSLLTTGSTDK
MVKLWDITNNQPSCIASTNPNVGAVFSAAFSKNSPFLLATGGSKGILHVWDTLD
NSEVARRFGKFRPQN
494 WEE1-like MIMDENEFCDIFSLRKRLCLLSSQEGEEEEELEAMSQLDAGEFTVTGNEEVVAI 206 1657
protein AEDDVNTGILSQDLFSSQDYCTPSQPQDSTDLDSKDKAPCPLSPVKSTIQRKRC
RPELLSNPPDSIQFSFQRLERVRSEESIQSSSQQLARVRSEVSSSDDFKTPKIT
ASGQKNYVSQSALALRARVMSPPCIKNPYLDENEELNEKIQRSTRRSPACVTPI
QSGACLSRYRADFHELEEIGRGNFSRVYKALNRLDGCCYAVKCSQSELRLDTER
KVALMEVQSLAALGPHKNIVGYHTAWFENDHLYIQMELCDHNLTTANDRGILRT
DTDFLEAVYQIAQALEFIHGRGVAHLDVKPENIYVRDGTYKLGDFGRATLINGT
LHVEEGDARYMSREILNDNYEHLDKVDMFSLGATFFELLMRKQYPGSGKRIDRD
TEIKIPILPGFSIYFQKLLQDLVSNDPGKRPSAKDVLKNPIFNKVRGAKEV
495 WD40 repeat MLAPALEMEPVEPQSLKKLSFKSLKRALDLFSPVHGQIAPPDPESKKMRISYSL 117 1580
protein NFEYGGGSGSEDQVPKRKESGAAQNQGQQAAGASNALALPGPEGSKIPPMEKSQ
NALTVGPSLRPQGLNDVGLHGKGTAIISASGSSDRNLSTASAIMERLPSRWPRPV
WHPPWKNYRVISGHLGWVRSIAFDPSNQWFCTGSADRTIKIWDLASGRLKLTLT
GHIEQIRGLAVSSKHTYMFSAGDDKQVKCWDLEQNKVIRSYHGHLSGRLKLTLT
PTIDILLTGGRDSVCRVWDIRSKMQIFALSGHDNTVCSVFARPTDPQVVTGSHD
TTIKFWDLRHGKTMTTLTNHKKSVRAMAQHPKENCFASASADNIKKFQLPRGEF
LHNMLSQQKTIINTMAVNEEGVMATGGDNGSLWFWDWKSGHNGQQAHTIVQPGS
LESEAGIYALSYDLTGSRLVSCEADKTIKMWKEDELATPETHPLNFKPPKDIRRF
496 WD40 repeat MEEAAKEQSAGSGKPKLLRYGLRSAAKPKEDKKEEQLHQPPPPPPPQQQAAPAP 111 1700
protein APAATRSSTSGSAGGRDRRPQQQHAVDEKYARWKSLVPVLYDWLANHNLLWPSL
SCRWGPQLEQATYKNRQRLYISEQTDGSVPNTLVIANCEVVKPRVAAAEHVSQF
NEEARSPFIRKYKTIIHPGEVNRIRELPQNPNIVATHTDSPDVLIWDVESQPNR
HAVYGATASRPNLILTGHQENAEFALAMCPAEPFVLSGGKDKTVVLWSIQDHIT
ASATDQTTNKSPGSGGSIIKKTGEGNEETGNGPSVGPRGIYCGHEDTVEDVAFC
PSTAQEFCSVGDDSCLILWDARIGTNPVAKVEKAHNGDLHCVDWNPHDNNLILT
GSADNSVNMFDRRNLTSNGVGSPVYKFEGHKAAVLCVQWSPDKPSVFGSSAEDG
LLNIWDYERVDKKVDRAPNAPAGLFFQHAGHRDKIVDFHWNTADPWTMVSVSDD
CDTAGGGGTLQIWRMSDLIYRPEEEVLAELENGKAHVLECSKA
497 WD40 repeat MAKDEEEFRGEMEERLVNEEYKIWKKNTPFLYDLVITHALEWPSLTVQWLPDRE 144 1412
protein EPPGKDYSVQKMILGTHTSDNEPNYLMLAQVQKOKEDAENDARQYDDERGEIGG
EGCANGKVQVTQQTNHDGEVNEARYYIPQNPETTATKTVSAEVYVEDYSKHPSKP
PQDGGCHPDLRLRGHNTEGYGLSWSPFKHGHLLSGSDDAQICLWDINVPAKNKV
LEAQQIFKVHEGVVEDVAWHLRHEYLFGSVGDDRHLLIWDLRTSATNKPLHSVV
AHQGEVNCLAFNPFNEWVLATGSADRTVKLFDLRKISSALHTFSCHKEEVFQIG
WSPKNETILASCSADRRLMVWDLSRIDEFQTPEDALDGPPELLFIHGGHTSKIS
DFSWNPCEDWVIASVAEDNILQIWQMAENIYHDEEDDMPPEEVV
498 Cyclin- MGKYMRKGKGVGEVAVMEVSQGSLGVRTRARTLAAASSQKDHRRLGASKSVTTK 793 1683
dependent HQSSAPPASPCVESSMHTCYLELRSRKLEKFSRCYHSAHGATSHGESKRSLSLS
kinase EPSRLAVSEEARVASDKSSHRVLQQQSSVAHSRNNASATFSHNAKPAKAAQRKER
inhibitor RDDDHTSARPSEAPHEDEDGMEVEASFGENVMDLDSRERRTRETTPSSYTRDVE
TMETPGSTTRPPSNAGRRRFQTEGGHGTRNQFHVPTTNEIEEFFAGAEQQEQRR
FTDRYNYDPVSDSPLPGRFEWVRLRP
499 CDK type D MQNMEENVQSSWSLHGNKEICARYEILKRVSSGTYLDVYRGRRKEDGLIVALKE 415 2196
VHDYQSSWREIEALQRLCGCPNVVRLYEVILEFLTSDLYSVIKSAKKNKGENGIP
EAEVKAWMIQILQGLANCHANWVIHRDLKPSNMLISAYGILKLADFGSMSFLKR
AIYEVEYELPQEDILADAPGERLMDEDDSVKGVWNEGEEDSSTAVETNFDDMAE
TANLDLSWKNEGDMVMQGFTSGVGTRWYRAPDFLYGATIYGKEIDLWSLGCILG
ELLILEPLFSGTSNIDOLSRLVKVLGLQQKKNWPGCSNLPDYRKLCFPGDGSPV
GLKNHVPNCSDNMFSILERLVCYDPAARLNAKEIVENKYFVEDPYPVLTHELRV
PSPLREENNFSEDWAKWKDMEVDSDLENIDEFNVVHSSDGFCIKFS
500 Histone MAPVKRIEPEKTKANEGKPKRRKVAFAIDTGIEANDCISLHLVSTPEEMRDAEG 109 1653
acetyltransferase VEDQSLSFNPEYMQHFVGEHGKIYGYKGLKIDVWLNALSFHAYVDIQYESKVEE
GKSEKEATDLTDIMKRIFGRGLVEDRNAFIQSFSSNSQSIESMIHNEGERIATR
EILTDKGLSAQGDSERLGVSNEIFRLELSDPQIREWHARLEPLVLLFVEGSQPI
EQDDPKWEMYIRVQRESLSGGSAVCRLLGFCTVYRFYHYPDTTRLRISQILVFP
PYQGKGHGLLLLEAVNKTAVSRDSYDVTVEEPSESLQELRDCMDTIRISQILVFP
MPAVKSAVQKLKEANPSDKGAADHCLEGNVNNETVTTSSTKPKNKSGWFPPPGL
VEEVRKHLKISKKQFKRCWEILLYLNLDRSDSQCEDKYHISLMEQIMSELFDKS
SEKSAKGKRVIDIDNEYDNSKTFIMVRTRNPGNGEGFLPEALEGGMEVSQEDQL
KSLFEERLEEIAQIAEKVPSLCKALQMP
501 Histone MPEDRKKILEALAAKRKAEAESGEKKPRQKSSLNPAKPVSKPVSKPVGGIGSKG 343 1023
deacetylase KSTSAPISSTKAKSKHKEEVKAKRVTKMDRYETDEDDESEEEEDLDSESDDDEL
SDEDSEDDIKSKSVKKLPPQSKGKAPVKGISSSNGKGRDEKGKGIMKDKGKAKA
KVEESSSDAEGDSDDDGGDLSDDPLQEVDPSNILPSKTRREASQPTNYQFANMS
GDDDDDDDSD
502 Histone MADVPESLQQEKDEQGTDKNCCDGKFQKEIDIDDMEEEYNESSIDDEEENLSDN 417 2351
deacetylase VATNNMGTIPQGQACMAVTVEGIEHANSVGCGRNGREGSEEVTAAEDMGHVSIE
NIREQGRNRKSSEQLLALYEQEGLLEDDEDDDDVDWEPFEGVTVQMKWYCTNCT
MANSDDSVHCDSCGEHRNSDILRQGFLASPYLPAESPSSSDVPDERLEESKCVM
TTLTPSISPMIGVCCSSLQSERRTVVGFDERMLLHSEIQMETYPHPERPDRLRA
IAASLRAAGLFPGKCFSIPAREATCEELQTIHSLEHVNAVESTSCGMLSHLSPD
TYANEHSSLAARLAAGLCADLAKAIMTGQAQNGFALVRPPGHHAGVKDSMGFCL
HNNAAIAVSASRVVGAKKVLIVDWDVHHGNGTQEIFEADQSVLYISLHRHGEGF
YPGSGAVTEVGSSKGEGYSVNIPWKCGGVGDNDYIFAFQHAVLPIAEQFEPDLT
IISAGFDAAKGDPLGRCEVTPDGFAHMAQMLSCLSKGKMLVILEGGYNLRSISA
SATAVIKVLLGDNPKALPIDIQPSKGGLQTLLEVFEIQSKYWSSLKGHDQKLRS
QWEAQYGSKKRKVIRKRHMHIVGGPVWWKWGRKRVVYYHWFARVSSRKHL
503 Peptidylprolyl MASGAGAAGVVEWHQKPPNPKNPVVFFDVTIGTIPAGRIKMELFADIVPRTAEN 69 641
isomerase FRQFCTGEYRKAGIPIGYKGCHFHRVIKDFMIQAGDFVKGDGSGCISIYGSKFE
DENFIAKHTGPGLLSMANSGPNTNGCQFFLTCAKCDWLDNKHVVFGRVLGEGLL
VLRKIENVQTGQHNRPKLPCVIAECGEM
504 Peptidylprolyl MAKLVSSVCAFSCQQRHPHSRPRFLSNRDHYNHYHNHSHYHNVCYFPPMMMMQQ 172 1623
isomerase QLQKQKRMTTKTITSLFKCNSSNHTLLKGLKEFMGFKFRLQAAMLSCEMSILGR
VFAIFFIVHQAAAPFPFNHFDNWLVPPASAVLYSPNTKVPRTGEVALRKSIPAN
PAMKSIQDFLEDIYYLLRFPQRKPYGTMEGDVKSALQIAINEKDSILGSVPLDM
KERGLQLYNFLIDGQGGLQVLIEYIKEKDPDKVSVNLSSSLDTIAQLELLQAPG
LPYLLPEEYQQYPRLNGRATIEFTMEKGDNSMFSVSSGGGLQKTATIQVVLDGY
SAPLTAGNFTKLVIDGAYNGLKLKTTEQAVISDNERAEAGFNLPIEILPAGGFE
PLYRTTLSVQDGELPVLPLSVYGAIAMAHNTISEDYSSPSQFFFYLYDKRNAGL
GGLSFDEGQFSVFGYTTVGKEILPQLKTGDIIKSAKLVDGFDHLVLPSSST
505 WD40 repeat MDHYYQDDFDYLVDDEMVDFADDVEDDVRTRRRSDIDSDSENDFDSNNKSPDTT 231 1768
protein ALQAKRGKDIQGIPWNRLNFTREKYRETRLQQYKNYENLPRPRRSRNLDKECTN
FERGSSFYDFRHNTRSVKATIVHFQLRNLVWATSKHNVYLMQNYSIMHWSSLKQ
KGEEVLNVAGPIIPSVKHPGSSPQGLTRVQVSAMSVKDNLVVAGGFQGELICKY
LDKPGVSFCTKISHDENGITNAVEIYNDASGATRLMTANNDLAVRVFDTEKFTV
LERFSFPWSVNHTSVSPDGKLVAVLGDNADCLLADCKTGKTVGTLRGHLDYSFA
AAWHPDGYILATGNQDTTCRLWDVRKLSSSLAVLKGRMGAIRSIRFSSDGRFMA
MAEPADFVHLYDTRQNYTKSQEIDLFGEIAGISFSPDTEAFFVGVADRTYGSLL
EFNRRRMNYYLDSIL
506 WD40 repeat MDCSGDEEEEQFFESLEEMLSPSDSGSEAADNETGCRNADARSKYEIWKRAPSS 376 2943
protein IQERRQRFLVRMGLANPSELGNQVNSTSAESTCSTETANIPNGIERLRENSGAV
LRTAGSSGRKTHCKNVINIGLREGSVRSSSSSNGTPDVGEDNGEFGGTIFSRSG
GTWECMCKIKNLDSGKEFVVDELGQDGLWNKLREVGTDRQLTMDEFERSLGLSP
LVQELMRRESGVAQADCNGVHHHDAEISSSKRRSWLKALKSAAYSMRRPKEDQS
NYDSERSGRRSGSFDVPWGKPQWTKVRHYRKRYKEFTALYMGQEIEAHEGSIWT
MKFSLDGRYLASAGQDCVIHVREVIESMRTFGADTPDLYASSAYFSMNGLQELV
PLSIEDHANKMKRGKIIGSKKSSNSDCIVLPNKVFQLSEEPVCSFHGHLLDVFD
LSWSPSQYLLSSSMDKTVRLWKLGHESCLKVFSHNDIVTCIQFNPVDERYFISG
SLDGKARIWSIPDRQVVDWSDLREMVTAVCYTPDGQGGLVGSIKGSCRFYNTSG
NKLQLENQLNVRSKKKKSSGKKITGFQFAPGGDSQKVLITSADSRVRVYNGSEL
VCKYKGFRNTCSQISASFAPNGQHFVCASEDSRVYIWNHESPRGSGARHEKSSW
SHEHFLSQGVSVAIPWSGMKLQPPVWNSPEFMLGQRHNLLSLQGGKDVGCQNGL
LSREAGEGQESETPLHYISQVSHSCGSQNMVDRDGQDDLSRYSACISDSRLSSF
MAFPESPGNPDDLNSKVFFSDSSSKGSATWPEEKLPPTRKQSRSNSTSSHYDTL
KTHLGNTIQGQSGASAAVAWGLVIVTAGHGGEIRSFQNYGLPVRL
507 WD40 repeat MPSIPAIGEFTVCEINRELLTTKDESDTQAKDAYAKILGLVFPPISFQIEEGFG 107 1498
protein SASRQQFDQDLDREDTIVTPSTSEGTNALQEGGLLLKGVSVLKNILASSFGPIF
SPNDTKVLKKVELLQGISWHRHKHILAFISGSNQVTVHDFQDPEWRESSLLVSE
SQRGIEALEWRPNGGTTLSVACRGGICIWSASYPGSVAPVRSGVASFLGTSTRG
SSVRWTLVDFLQIPGGKAVTALSWSPTGRLLASASREDSSFTIWDVAQGVGTPL
RRGLGGISLLKWSPTGDYLFSAKPNGTFYLWETNTWTLEQWSSSGGCVISATWG
PDGRMLFMAFSESTTLGSLHFAGRPPSLDAHLLPMELPEIGSITGGFGNIEKMA
WDGCGERLAVSYTGGDLMYVGLIAIYDTRRTPFISASLVGFIRGPGEQVKPLAF
AFHDKFKQGPLLSVCWSSGLCCTYPLIFRAH
508 WD40 repeat MEEENAKHTEETRQVQVRFTTKLQPALRVPTTSIAIPAHLTRYGLSDIVNTLLG 118 1425
protein NDKPQPFDFLVESELVRTSLEKLLLIKGISAEKILNIEYILAVVPPKQEEPSLH
DDWVSVVDGSYPNFIFSGSFDSIGRIWKGEGLCTHVLEGHRDAITSAAFIMPSD
SSDSFINLATASKDRTLRLWQFKPNEHMTNGKMVRPYKLLKGHTSSVQTVSACP
RRNLICSGSWDCSIKIWQTAGEMDIESNAGSVKKRKLEDSTEQIISQIEASRTL
EGHSQCVSSVVWLEKDTIYSASWDHSVRSWDVETGVNSLTVGCRKALHCLSIGG
EGSALIAAGGADSVLRIWDPRMPGTFTPILQLSSHKSWITACKWHPKSRHHLIS
ASHDGTLKLWDVRSKVPLTTLEAHKDKVLCADWWKEDCVISGGADSTLQIFSNL
NLT
509 WD40 repeat MNRLRSKRNHILELRLGQSEPEKEATLASNRSRGTNAPIVVEDDDDVVVSSPRS 186 797
protein FALARSSVSQRSSRIPIVNEEDLELRLGLAVTGRTSAEHNPRRRHGRVPPNKPI
VLCDDAGEADQSSSKKRRTGQQLSSDVQSDESKEVKLTCAICISTMEEETSTIC
GHIFCKKCITNAIHRWKRCPTCRKKLAINNIHRIYISSSTG
510 WD40 repeat MEEPPPPAVLPSSEDTSIVSSHSFVNAPPTVPVGLDASIPQISTPGINQPGLTI 387 2456
protein PVPPEAAPLTASLVAASAGMPPAVVPSFVRPAIVAHPSVMPPPSMPLAALPMPV
ASAVPVAAPHFPPSTPNDNSITPSMPVPTPIVASSSVPPSVTIPGIAPLPFIAP
IPVPSSRPVAPSPFMPPARPLGASVSVAMDVDNTDEQDQDADNKGESPSSSPDH
PEDPSAAEYEITEESRKVRERQEQAIQELLLRRRAYALAVPTNDSSVRARLRRL
NEPITLFGEREMERRDRLRALMAKLDAEGQLEKLMKVQEEEEAAANVDAEEVQE
MEGPQVYPFYTEGSQELLKARTEITKFSLPRAVSRLQRARRKREDPDEDEDEEL
KCVLQQSAQINMDCSEIGDDRPLSGCAFSSDGTLLATSAWSGVTKLWSVPNINK
VATLKGHTERVTDVAFSPTNCHLATACADRTAMLWNSEGVLMKTYEGHLDRLAR
LAFHPSGLYLGTASFDKTWRLWDVNTGIELLLQEGHSRSVYGIAFQCDGSLAAT
CGLDGLARIWDLRTGRSILALEGHVKPVLGIDFSPNGYHLATGSEDHTCRIWDL
RKRQSVYIIPAHSHLVSQVKFEPQEGYFLVTASYDSTAKVWSARDFKSIKVLAG
HEAKVTSVDITADGQYIATVSHDRTIKLWSSKNSTNDMNIG
511 WD40 repeat MKRAYKLQEFVAHASNVNCLKIGKKSSRVLVTGGEDHKVNMWAIGKPNAILSLS 359 2761
protein GHSSAVESVTFDSAEALVVAGAASGTIKLWDLEEAKIVRTLTGHRSNCISVDFH
PFGEFFASGSLDTNLKIWDIRRKGCIHTYKGHTRGVNSIRFSPDGRWVVSGGED
NIVKLWDLTAGKLMHDFKCHEGQIQCMDFHPQEFLLATGSADRTVKFWDLETFE
LIGSAGPETTGVRAMIFNPDGRTLLTGLHESLKVFSWEPLRCYDAVDVGWSKLA
DLNIHEGKLLGCSYNQSCVGVWVVDISRVGPYAAGNVSRTNGHNEAKLASSGHP
SVQQLDNNLKTNMARLSLSHSTESGIKEPKTTTSLTTTEGLSSTPQRAGIAFSS
KNLPASSGPPSYVSTPKKNSTSRVQPTTNFQTLSRPDIVPVIVPRSNSLRPETT
SDAKKEMNNFGRVVPSTVSTKSTDVIKSGSNRDESDKIDSINQKRMTGNDKTDL
NIARAEQHVSSRLDNTNTSSVVCDGNQPAARWIGAAKFRRNSPVDPVVSPHDRS
PTFPWSATDDGVTCQPDRQVTAPELSKRVVEPGRARALVASWETREKALTADTP
VLVSGRPPTSPGVDMNSFIPRGSHGTSESDLTVSDDNSAIEELMQQHNAFTSIL
QARLTKLQVIRRFWQRNDLKGAIDATGKMGDHSVSADVISVLIERSEIFTLDIC
TVILPLLTRLLQSETDRHLTVAMETLLVLVKTFGDVIRATISATPTIGVDLQAE
QRLERCNLCYVELENIKQILVPLIRRGGAVAKSAQELSLALQEV
512 Cyclin B MAGSDENNPGVVGGAHVQEGLRVGAGKMGAGNVQQRRALSNINSNIIGAPPYPC 238 1648
AVNKRVLSEKNVNSENDLLNAAHRPITRQFAAQMAYKQQLRPEENKRTTQSVSN
PSKSEDCAILDVDDDKMADDFPVPMFVQHTEAMLEEIDRMEEVEMEDVAEEPVT
DIDSGDKENQLAVVEYIDDLYMFYQKAEASSCVPPNYMDRQQDINERMRGILID
WLIEVHYKFELMDETLYLTVNLIDRFLAVQPVVKKKLQLVGVTAMLLACKYEEV
SVPVVEDLILISDRAYSRKEVLEMERLMVNTLHFNMSVPTPYVFMRRFLKAAQS
DKKLELLSFFIIELSLVEYDMLKFPPSLLAASAIYTALSTITRTKQWSTTCEWH
TSYSEEQLLECARLMVTFHQRAGSGKLTGVHRKYSTSKFGHAARTEPANFLLDF
RL
513 Cyclin- MQAPREGKSAAAIVGMGKYMKKSKAIPRDVSLLEASPRSPSATGVRTRAKTLAS 59 859
dependent RRLRRASQRRPPPPAAAAAAAAPSLDASPCPFSYLQLRSRRLRRPRLAPSPEAR
kinase IDEGPAGSGSRGSRDASCSARTASSSGGVEGEGACVGRGDRGNGGECVRDAAVD
inhibitor ASYGENDLEIEDRDRSTRESTPCSLIRDSNANTPPGSTTRQQSSCTAHRTQMSI
LRSIPTSDEMEEFFAYAEQRQQRSFIEKYNFDIVKDRPLPGRFEWVQVIP
514 Histone MDGHSSHLAAQNRSRGSQTPSPSHSAASASATSSIHLKRKLSAANASAASAAAA 44 1829
acetyltransferase AAAAAAAADDHAPPFPPSSISADTRDGALTSNDDLESISARGGGAGDDSDDDSD
DEEEDDGDNDGGSSLRTFTAARLENVGPAAARNRKIKAESNATVKVEKEDSAKD
GGNGAGVGALGPAATSGAGSGSGTVPKEDAVKIFTENLQASGAYSAREENLKRE
EEAGRLKFECLSNDGVDDHMVWLIGLKNIFARQLPNMPKEYIVRLVMDRNHKSV
MVIRRNLVVGGITYRPYASQKFGEIAFCAIKADEQVKGYGTRLMNHLKQHARDV
DGLTHFLTYADNNAVGYFIKQGFTKEIYLDKDRWHGYIKDYDGGILMECKIDPK
LPYTDLSTMVRRQRQAIDEKIRELSNCHIVYQGIDFQKRDAGVPQNTIKMEDIP
GLREAGWTPDQWGYSRFRGLSDQKRLTFFIRQLLKVLNDHSDAWPFKEPVDARE
VPDYYDIIKDPMDLKTMTKRVESEQYYVTLEMFIADVKRMFANARTYNSPDTIY
FKIATRLEAHFQSKVQSNLQSGAGKIQQ
515 Peptidylprolyl MFNGMMDPELFKLAQEQMNRMSPAELAKIQQQMMSNPELMRMASESMKNMRPED 109 1866
isomerase LRQAAEQLKHVRPEEMAEIGEKMANASPEEIAAVRARADAQMTYEINAAKILKK
EGNELHSQGRFKDASQKYLRAKNNLKGIPSSEGKNLLLACSLNLMSCYLKTRQY
EECIKEGSEALACEEKNLKAFYRRGQAYRELGQLKDAVSDLRKAHEISPDDETI
AQVLRDTEESLTKEGGSAPRGVVIEEITEEDETLASVNHESPSEYSEKRHQESE
DAHKGPINGDIMGQMTNSESLKALKGDPDAIRSFQNFISNADPTTLAAMGAGNA
GEVSPDLIKTASSMIGKMSAEELQKMIQLASSFPGENPYVTRNSDSNSNSFGNG
SIPNVSPDMLKTASDMMSKMSPDDLQRMFEMASSSRGKDPSLDANHASSSSGAN
LAANLNHILGESEPSSSYHIPSSSRNISSSPLSNFPSSPGDMQEQIRNQMKDPA
MRQMFTSMMKNMSPEMMANMGKQFGLELSPEDAAKAQEAMSSLSPEMLDKMMRW
ADRAQRGVETAKKTKNWLLGRPGMILAICMLLLAVILHRLGFIGS
516 WD40 repeat MIAAISWVPRGASKAVPEVAEPPSKEEIEEILKSGVVERSGDSDGEEDDENMDA 212 1815
protein VASEKADEVSTALSAADALGRISKVTKAGSGFEDIADGLRELDMDNYDEEDEDV
KLFSTGLGDLYYPSNDMDPYLKDKDDDDDTEEIEDLSIKPMDSLIVCARTDDEV
NLLEVYLLEPSLSDESNMYVHHEVVISEFPLCTAWLDCPIKGGDKGNFIAVGSM
EPAIEIWDLDIIDAVEPCLVLGGQEELKKKKKKGKKASIKYKEGSHTDSVLGLA
WNKEFRNILASASADRQVKIWDVAAGKCNITMEHHTDKVQAVAWNHHAPQVLLS
GSFDHSVVMKDGRIPSHSGYRWSVTADVESLAWDPHSEHFFVVSLEDGTVRGFD
VRAAISNSASQSLPSFTLHAHEKAVSTISYNPAAPNLLATGSTDKMVKLWDLSN
NQPSCIASRNPKAGAVFSVSFSEDSPLLLAIGGSKGRLEVWDTSSDAAVSRRFG
KHGKPKTAEPGS
517 WD40 repeat MKFCKKYQEYMQGQEGKKLPGLGFKKLKKILKRCRRRDSLHSQKALQAVQNPRT 207 1193
protein CPAHCSVCDGSFFPSLLEEMSAVLGCFNKQAQKLLELHLASGFQKYLMWFKGKL
RGNHVALIQEGKDLVTYALINAIAIRKILKKYDKIHLSTQGQAFKSQVQRMHME
ILQSPWLCELIAFHINVRETKANSGKGHALFEGCSLVVDDGKPSLSCELFDSIK
LDIDLTCSICLDTVFDSVSLTCGHIYCYMCACSAASVTIVDGLKAAEPKEKCPL
CREARVFEGAVHLDELNILLSRSCPEYWAERLQTERVERVRQAKEHWESQCRAF
MGVE
518 WD40 repeat MVSTQSTRENPSIFFPPPLKPWLLPVVLSLSLSRQLGMAAAAAASLPFKKNYRS 6 2786
protein SQALQQFYAGGPFAVSSDGSFIACNCGDSIKIVDSSNASLRPSIDCGSDTITAL
SLSPDGKLLFSAGHSRQIRVWDLSTSTCLRSWKGHDGPVMSMACPVSGGLLATG
GADRKVMVWDVDGGFCTHFFKGHDGVVSTVLFHPDSNRSLLFSGSDDGTIRVWD
LLAKKCASTLRGHDSTVTSLAFSEDGLTLLAAGRDKVVSLWDLHNYACKKTIPM
YEVLESVCVIHSGTVLASQLGLDDQLKVTKESAQNIHFITVGERGILRIWKSEG
SVCLFKQEHSDVTVISDEDDSRSGFTAAVMLPLDQGLLCVTADQQFLFYYPEKH
PEGIFSLTLCRRLVGYNEEIVDMKFLGEEENFLAVATNLEQVRVYELASMSCSY
VLAGHTETVLCLDTCISSSGRTLIVTGSKDNSVRLWDSESRHCIGVGVGHMGAV
GAVAFSRKRQDFFVSGSSDRTLKVWSLDGISEDGVDSTNLKAKAVVAAHDKDIN
SVAVAPNDSLVCSGSQDRTACVWRLPDLVSVVVLKGHKRGIWSVEFSPVDQCVL
TASGDKTVKIWAISDGSCLKTFEGHVSSVLRASFLTRGTQFVSCGADGLVKLWT
VRTNECIATYDQHSDKVWALAVGKKTEMLATGGSDAVVNLWYDSTASDKEDAFR
KEEEGVLKGQELENAVSDADYTKAIELALELRRPHKLFELFSELCRTREVGDRV
ERILSALSGEEVCLLLEYIREWNAKPKLCHVAQSVLSQVFRILSPTEIVEIKGI
GELLEGLIPYSQRHFSRIDRLVRSTYLLDYTLTGMSVIEPEADRSAVNDGSPDK
SGLEKLEDGLLGENVGEEKIQNKEELESSAYKKRKLPRSKDRSKKKSKNVVYAD
AAAISFRA
519 WD40 repeat MDSAPRRKSGGINLPSGMSETSLRLDGFSGSSSSFRAISNLTSPSKSSSISDRF 213 1726
protein IPCRSSSRLHTFGLVERGSPVKEGGNEAYSRLLKAELFGSDFGSLSPAGQGSPM
SPSKNMLRFKTESSGPNSPFSPSILRQDSGFSSEASTPPKPPRKVPKTPHKVLD
APSLQDDFYLNLVDWSSQNTLAVGLGTCVYLWSASNSKVTKLCDLGPNDGVCAV
QWTREGSYISIGTSLGQVQIWDGTQCKRVRTMGGHQTRTGVLAWNSRILASGSR
DRVILQHDLRVPNEFIGKLVGHKSEVCGLKWSHDDRELASGGNDNQLLVWNQHS
QQPVLKLTEHTAAVKAIAWSPHQNGLLASGGGTADRCIRFWNTTNGHQTSSVDT
GSQVCNLAWSKNVNELVSTHGYSQNQIMVWKYPSMAKVATLTGHSLRVLYLAMS
PDGQTIVTGAGDETLRFWNVFPSAKAPAPVKDTGLWSLGRTHIR
520 WD40 repeat MEDEAEIYDGVRAQFPLTFGKQSKPQTSLESVHSATRRGGPAPAPAPASSSSLP 101 2110
protein STTSPSAAGGAGKSSGLPSLSSSSTAWLEGLRAGNPRAGREAGIGSRGGDGEDG
GRAMIGPPRPPPGFSANDDGGGEDDDDDGDGVMVGPPPPPPGNLGDGDDDEEEE
EAMIGPPRPPVVDSDEEEEEEEEENRYRLPLSNEIVLKGHNKIVSALAVDPTGS
RVLSGSYDYTVRMFDFQSMNSRLSSFRDFEPVEGHQVRNLSWSPTADRFLCVTG
SAQAKIYDRDGLTLGEFVKGDMYIRDLKNTKGHITGLTWGEWHPKTKETILTSS
EDGSLRIWDVNDFKSQKQVIKPKLARPGRVPVTTCTWDREGKCIAGGIGDGSIQ
IWNLKPGWGSRPDIHVEQAHADDITGLKFSSDGKILLTRSFDDSLKVWDLRLMK
NPLKVFEDLPNHYAQTNIACSPDEQLFLTGTSVERESTIGGLLCFFDRSKLELV
SRIGISPTCSVVQCAWHPRLNQIFATSGDKSQGGTHVLYDPTLSERGALVCVAR
APRKKSVDDFELKPVIHNPHALPLFRDQPSRKRQREKILKDPLKSHKPELPMNG
PGHGGRVGASKGSLLTQYLLKQGGMIKETWMDEDPREAILKHADAAEKNPKFTR
AYAETQPDPVFAKSDSEDEDK

TABLE 16
BLAST Sequence Alignment Table.
SEQ ID; Target; Patent Identifier; BlastX top hit; Gene name; BlastX e value; BlastX identities; BlastX overlap
1 CDK type A eucalyptusSpp_003910 Q9FRN5 PUTATIVE 0 367 492
SERINE/THREONINE
KINASE
2 CDK type A eucalyptusSpp_019213 O44000 CDC2-LIKE e−160 217 290
PROTEIN
KINASE TPK2
3 CDK type A eucalyptusSpp_036800 Q40789 PROTEIN 0 259 294
KINASE
P34CDC2
4 CDK type A eucalyptusSpp_040260 Q27168 CDC2 e−156 208 304
5 CDK type A eucalyptusSpp_041965 Q43361 CDC2PA mRNA. e−159 274 294
SPTREMBL
6 CDK type B-1 eucalyptusSpp_002906 Q9FYT9 Cyclin- e−159 269 305
dependent
kinase B1-1
7 CDK type B-2 eucalyptusSpp_001518 Q9FSH4 B2-TYPE 0 270 315
CYCLIN
DEPENDENT
KINASE
8 CDK type C eucalyptusSpp_008078 Q9LDC1 CRK1 protein 0 415 558
9 CDK type C eucalyptusSpp_009826 Q9LNN0 F8L10.9 0 392 716
protein.
SPTREMBL
10 CDK type C eucalyptusSpp_010364 Q8GZA7 Putative e−172 309 499
cyclin-
dependent
protein
kinase.
11 CDK type C eucalyptusSpp_011523 Q8W2N0 Cyclin- e−165 273 405
dependent
kinase CDC2C
12 CDK type C eucalyptusSpp_024358 P93320 CDC2MSC 0 448 523
PROTEIN
13 CDK type C eucalyptusSpp_039125 O80540 F14J9.26 0 418 743
protein
14 CDK type D eucalyptusSpp_005362 O80345 CDK- e−180 305 483
activating
kinase 1AT
(Cdk-
activating
kinase
CAK1At)
15 CDK type D eucalyptusSpp_044857 O80345 CDK- e−177 302 477
activating
kinase 1AT
(Cdk-
activating
kinase
CAK1At)
16 Cyclin A eucalyptusSpp_001743 Q39879 MITOTIC 0 360 508
CYCLIN A2-
TYPE
17 Cyclin A eucalyptusSpp_012405 Q39878 MITOTIC e−179 278 470
CYCLIN A2-
TYPE
18 Cyclin B eucalyptusSpp_003739 Q9LDM4 F2D10.10 e−148 288 466
(F5M15.6)
19 Cyclin B eucalyptusSpp_022338 P93557 Mitotic e−168 310 476
cyclin
20 Cyclin B eucalyptusSpp_028605 Q40337 B-like e−158 300 439
cyclin.
SPTREMBL
21 Cyclin B eucalyptusSpp_041006 Q40337 B-like e−158 300 439
cyclin
22 Cyclin D eucalyptusSpp_006643 Q9SXN7 NtcycD3-1 1E−73 177 404
protein
23 Cyclin D eucalyptusSpp_045338 Q8LK74 Cyclin D3.1 e−101 190 332
protein.
SPTREMBL
24 Cyclin D eucalyptusSpp_046486 Q9ZRX7 CYCLIN D3.2 e−126 196 373
PROTEIN
25 Cyclin- eucalyptusSpp_012070 CAB69358 SEQUENCE 1 8E−64 83 88
dependent FROM PATENT
kinase WO9841642
regulatory
subunit
26 Histone eucalyptusSpp_006617 O80378 181 0 371 395
acetyltransferase (Fragment)
27 Histone eucalyptusSpp_007827 Q9FJT8 Histone e−148 260 465
acetyltransferase acetyltransferase
HAT B
28 Histone eucalyptusSpp_008036 Q9FJT8 Histone e−149 262 465
acetyltransferase acetyltransferase
HAT B.
SPTREMBL
30 Histone eucalyptusSpp_001596 Q9M4T5 Putative 7E−76 156 305
deacetylase histone
deacetylase
HD2
31 Histone eucalyptusSpp_005870 Q9M4T4 Putative 7E−66 144 318
deacetylase histone
deacetylase
HD2c
(AT5g03740/F17C15_160)
32 Histone eucalyptusSpp_006901 HDAC_ARATH Histone 0 405 499
deacetylase deacetylase
(HD)
33 Histone eucalyptusSpp_006902 AAM13152 HISTONE 0 427 499
deacetylase DEACETYLASE
34 Histone eucalyptusSpp_007440 Q8W508 HISTONE 0 369 428
deacetylase DEACETYLASE
35 Histone eucalyptusSpp_008994 Q8LD93 Histone 0 354 536
deacetylase deacetylase,
putative
36 Histone eucalyptusSpp_024580 Q94EJ2 At1g08460/T27G7_7 e−165 274 373
deacetylase (HDA8).
SPTREMBL
37 Histone eucalyptusSpp_037831 Q9FML2 Histone 0 356 464
deacetylase deacetylase.
SPTREMBL
38 MAT1 CDK- eucalyptusSpp_034958 Q8LES8 Hypothetical 4E−47 101 190
activating protein
kinase
assembly
factor
39 Peptidylprolyl 001209EGXC004488HT TL40_SPIOL Peptidylprolyl 0 329 392
isomerase cis-
trans
isomerase,
chloroplast
precursor
40 Peptidylprolyl 010310EGXD012820HT Q9FJL3 PEPTIDYLPROLYL 0 453 579
isomerase ISOMERASE
41 Peptidylprolyl 010310EGXD013036HT O82646 HYPOTHETICAL 0 302 521
isomerase 57.1 KDA
PROTEIN (EC
5.2.1.8)
42 Peptidylprolyl 010316EGXF999037HT BAB39983 PUTATIVE e−115 146 172
isomerase PEPTIDYLPROLYL
CIS-
TRANS
ISOMERASE,
CHLOROPLAST
43 Peptidylprolyl 010324EGXF002118HT AAK32894 AT5G13120/T19L5_80 e−122 179 264
isomerase
44 Peptidylprolyl 011019EGKA001923HT AAM14253 HYPOTHETICAL e−108 146 188
isomerase 20.3 KDA
PROTEIN
45 Peptidylprolyl eucalyptusSpp_000966 Q8L5T1 Peptidylprolyl 1E−91 155 170
isomerase isomerase
(Cyclophilin)
(EC
5.2.1.8)
46 Peptidylprolyl eucalyptusSpp_001037 Q8VX73 CYCLOPHILIN e−120 155 169
isomerase (EC 5.2.1.8)
47 Peptidylprolyl eucalyptusSpp_004603 AAM14253 HYPOTHETICAL e−108 146 188
isomerase 20.3 KDA
PROTEIN.
48 Peptidylprolyl eucalyptusSpp_005465 Q9SP02 Cyclophilin 2E−93 172 204
isomerase ROC7 (EC
5.2.1.8)
(AT5g58710/mzn1_160)
(Pepti . . .
49 Peptidylprolyl eucalyptusSpp_006571 O49605 EC 5.2.1.8 9E−98 169 224
isomerase (Cyclophilin-
like
protein)
(Peptidyl-
prolyl
50 Peptidylprolyl eucalyptusSpp_006786 Q93VG0 Cyclophilin 5E−82 142 164
isomerase (EC 5.2.1.8)
(Peptidyl-
prolyl cis-
trans
51 Peptidylprolyl eucalyptusSpp_007057 Q38901 Cytosolic 3E−84 144 172
isomerase cyclophilin
(EC 5.2.1.8)
(Peptidyl-
prolyl
52 Peptidylprolyl eucalyptusSpp_008670 Q9FJL3 PEPTIDYLPROLYL 0 423 596
isomerase ISOMERASE
53 Peptidylprolyl eucalyptusSpp_009137 Q9C566 Cyclophilin- e−168 285 361
isomerase 40 (EC
5.2.1.8)
(Expressed
protein)
54 Peptidylprolyl eucalyptusSpp_010285 Q9LY75 Cyclophylin- e−160 345 658
isomerase like protein
(EC 5.2.1.8)
(Peptidyl-
prolyl
55 Peptidylprolyl eucalyptusSpp_010600 Q93YQ8 HYPOTHETICAL 0 346 475
isomerase 50.1 KDA
PROTEIN
(FRAGMENT)
56 Peptidylprolyl eucalyptusSpp_011551 Q9ZVG4 T2P11.13 e−115 154 192
isomerase PROTEIN
57 Peptidylprolyl eucalyptusSpp_020743 Q8VXA5 PUTATIVE e−125 161 172
isomerase CYCLOSPORIN
A-BINDING
PROTEIN
58 Peptidylprolyl eucalyptusSpp_023739 FK21_NEUCR FK506- 3E−49 74 112
isomerase binding
protein
precursor
(FKBP-21)
60 Peptidylprolyl eucalyptusSpp_031985 Q8L8W5 Cyclophilin- 1E−82 155 229
isomerase like protein
(EC 5.2.1.8)
(Peptidyl-
prolyl
61 Peptidylprolyl eucalyptusSpp_032025 Q9LPC7 F22M8.7 1E−45 99 160
isomerase protein (EC
5.2.1.8)
(Peptidyl-
prolyl cis-
trans
62 Peptidylprolyl eucalyptusSpp_032173 Q8L8W5 Cyclophilin- 4E−83 156 229
isomerase like protein
(EC 5.2.1.8)
(Peptidyl-
prolyl
64 Retinoblastoma eucalyptusSpp_009143 Q9SLZ4 Retinoblastoma- 0 704 1008
related related
protein protein
65 WD40 repeat eucalyptusSpp_000349 AAK49947 TGF-BETA 0 291 326
protein RECEPTOR-
INTERACTING
PROTEIN 1
66 WD40 repeat eucalyptusSpp_000575 Q9LW17 WD-40 repeat e−168 282 341
protein protein-like
(Expressed
protein)
67 WD40 repeat eucalyptusSpp_000804 GBLP_SOYBN Guanine 0 291 326
protein nucleotide-
binding
protein beta
subunit-like
68 WD40 repeat eucalyptusSpp_000805 GBLP_MEDSA Guanine e−171 291 327
protein nucleotide-
binding
protein beta
69 WD40 repeat eucalyptusSpp_000806 GBLP_MEDSA Guanine e−171 291 327
protein nucleotide-
binding
protein beta
subunit-like
70 WD40 repeat eucalyptusSpp_002248 AAL86002 HYPOTHETICAL 0 261 388
protein 43.8 KDA
PROTEIN
71 WD40 repeat eucalyptusSpp_003203 Q9SY00 Putative WD- e−144 236 317
protein repeat
protein
(AT4G02730/T5J8_2)
72 WD40 repeat eucalyptusSpp_003209 AAM14986 HYPOTHETICAL e−160 259 302
protein 32.6 KDA
PROTEIN
73 WD40 repeat eucalyptusSpp_004429 Q9SZQ5 HYPOTHETICAL 0 260 322
protein 34.3 KDA
PROTEIN
74 WD40 repeat eucalyptusSpp_004607 AAC27402 EXPRESSED 0 253 356
protein PROTEIN
75 WD40 repeat eucalyptusSpp_004682 AAK00964 HYPOTHETICAL 0 264 313
protein 35.3 KDA
PROTEIN
76 WD40 repeat eucalyptusSpp_005786 Q944S2 At2g47790/F17A22.18 e−155 264 396
protein (Expressed
protein).
SPTREMBL
77 WD40 repeat eucalyptusSpp_005887 Q94AB4 AT3g13340/MDC11_13 0 332 446
protein
78 WD40 repeat eucalyptusSpp_005981 Q8L4X6 WD-repeat 0 315 348
protein protein
GhTTG2.
SPTREMBL
79 WD40 repeat eucalyptusSpp_006766 Q8L4M1 Putative WD- e−137 234 369
protein 40 repeat
protein
80 WD40 repeat eucalyptusSpp_006769 Q9LJC6 RETINOBLASTOMA- 0 372 566
protein BINDING
PROTEIN-LIKE
81 WD40 repeat eucalyptusSpp_006907 Q94C94 Hypothetical 0 446 812
protein protein.
82 WD40 repeat eucalyptusSpp_007518 Q93ZN5 AT4G00090/F6N15_8 0 311 436
protein
83 WD40 repeat eucalyptusSpp_007717 O82266 At2g47990 e−180 327 528
protein protein
(Hypothetical
58.9 kDa
protein)
84 WD40 repeat eucalyptusSpp_007718 Q8RWD8 Hypothetical e−173 278 350
protein protein.
SPTREMBL
85 WD40 repeat eucalyptusSpp_007741 Q8LA40 Putative WD- e−158 269 409
protein 40 repeat
protein,
MSI2
86 WD40 repeat eucalyptusSpp_007884 Q9FHY2 Similarity e−149 316 765
protein to unknown
protein
87 WD40 repeat eucalyptusSpp_008258 Q9LHN3 EMB|CAB63739.1 0 524 758
protein (AT3G18860/MCB22_3)
88 WD40 repeat eucalyptusSpp_008465 Q9FLS2 WD-repeat 0 366 460
protein protein-like
89 WD40 repeat eucalyptusSpp_008616 Q9LYK6 Hypothetical e−148 252 321
protein protein
90 WD40 repeat eucalyptusSpp_008690 Q9SW94 G PROTEIN 0 326 376
protein BETA SUBUNIT
91 WD40 repeat eucalyptusSpp_008708 Q8L862 Hypothetical e−167 297 487
protein protein
92 WD40 repeat eucalyptusSpp_008850 O22725 F11P17.7 0 402 853
protein protein.
SPTREMBL
93 WD40 repeat eucalyptusSpp_009072 Q9SAJ0 F23A5.2 (form e−176 288 350
protein 2) (mRNA
export
protein,
putative)
94 WD40 repeat eucalyptusSpp_009465 Q9FLX9 NOTCHLESS 0 384 475
protein PROTEIN
HOMOLOG
95 WD40 repeat eucalyptusSpp_009472 Q9SZA4 WD-REPEAT 0 374 457
protein PROTEIN-LIKE
PROTEIN
96 WD40 repeat eucalyptusSpp_009550 Q9FKT5 Gb|AAF54217.1 e−167 275 313
protein (Hypothetical
protein)
97 WD40 repeat eucalyptusSpp_010284 O22466 WD-40 repeat 0 397 423
protein protein MSI1
98 WD40 repeat eucalyptusSpp_010595 Q94C94 Hypothetical 0 419 789
protein protein
99 WD40 repeat eucalyptusSpp_010657 Q94AH2 HYPOTHETICAL 0 243 298
protein 33.1 KDA
PROTEIN
100 WD40 repeat eucalyptusSpp_012636 Q8L611 Hypothetical 0 756 1133
protein protein
101 WD40 repeat eucalyptusSpp_012748 AAD10151 PUTATIVE WD- 0 375 469
protein 40 REPEAT
PROTEIN,
MSI4
102 WD40 repeat eucalyptusSpp_012879 Q8VZY6 FERTILIZATION- 0 291 377
protein INDEPENDENT
ENDOSPERM
PROTEIN
103 WD40 repeat eucalyptusSpp_015515 Q8LPI5 Putative WD- 0 360 493
protein repeat
protein.
SPTREMBL
104 WD40 repeat eucalyptusSpp_015724 O22607 WD-40 repeat 0 395 522
protein protein MSI4
105 WD40 repeat eucalyptusSpp_016167 Q93YS7 Putative WD- 0 663 917
protein repeat
membrane
protein
106 WD40 repeat eucalyptusSpp_016633 Q9SUY6 HYPOTHETICAL e−174 240 384
protein 43.8 KDA
PROTEIN
107 WD40 repeat eucalyptusSpp_017485 Q8RXC4 Hypothetical 0 650 1348
protein 144.7 kDa
protein
108 WD40 repeat eucalyptusSpp_018007 O94289 WD repeat- e−129 302 794
protein containing
protein
109 WD40 repeat eucalyptusSpp_020775 Q8W403 Sec13p e−150 242 304
protein
110 WD40 repeat eucalyptusSpp_023132 AAK52092 WD-40 REPEAT 0 458 515
protein PROTEIN
111 WD40 repeat eucalyptusSpp_023569 Q9XIJ3 T10O24.21. 0 404 576
protein SPTREMBL
112 WD40 repeat eucalyptusSpp_023611 Q8L4J2 Cleavage e−174 301 438
protein stimulation
factor 50K
chain
(Cleavage
stimulation
113 WD40 repeat eucalyptusSpp_024934 Q94AB4 AT3g13340/MDC11_13. 0 343 444
protein WD-
repeat
protein-like
SPTREMBL
114 WD40 repeat eucalyptusSpp_025546 O22212 Hypothetical 0 352 566
protein 61.8 kDa
Trp-Asp
repeats
containing
protein
115 WD40 repeat eucalyptusSpp_030134 Q9LVF2 Genomic DNA, 0 677 946
protein chromosome
3, P1 clone:
MIL23
116 WD40 repeat eucalyptusSpp_031787 AAL91206 WD REPEAT 0 264 329
protein PROTEIN-LIKE
117 WD40 repeat eucalyptusSpp_034435 Q9SAJ0 F23A5.2(form e−178 290 349
protein 2) (mRNA
export
protein,
putative).
SPTREMBL
118 WD40 repeat eucalyptusSpp_034452 Q94BR4 Hypothetical 0 381 525
protein protein
(Putative
pre-mRNA
splicing
factor
119 WD40 repeat eucalyptusSpp_035789 P93563 Guanine 3E−88 171 356
protein nucleotide-
binding
protein beta
subunit
120 WD40 repeat eucalyptusSpp_035804 Q9FNN2 WD-repeat 0 356 589
protein protein-
like.
SPTREMBL
121 WD40 repeat eucalyptusSpp_043057 Q9LV35 WD40-repeat 0 472 610
protein protein.
SPTREMBL
122 WD40 repeat eucalyptusSpp_046741 Q93VK1 AT4g28450/F20O9_130 0 363 452
protein
123 WD40 repeat eucalyptusSpp_047161 Q9ZUN8 Putative WD- 0 350 473
protein 40 repeat
protein
124 CDK type A pinusRadiata_001766 Q9M3W7 PUTATIVE e−128 237 436
CDC2-RELATED
PROTEIN
KINASE CRK2.
125 CDK type A pinusRadiata_002927 Q9FRN5 PUTATIVE 0 349 470
SERINE/THREONINE
KINASE
126 CDK type B-1 990309PRCA009171HT Q9FYT8 Cyclin- e−145 244 303
dependent
kinase B1-2
127 CDK type B-1 pinusRadiata_013714 Q9FYT8 CYCLIN- e−174 222 304
DEPENDENT
KINASE B1-2
128 CDK type B-1 pinusRadiata_016332 Q9FYT8 CYCLIN- e−178 228 304
DEPENDENT
KINASE B1-2
129 CDK type B-1 pinusRadiata_021677 Q9FYT8 CYCLIN- e−176 229 304
DEPENDENT
KINASE B1-2
130 CDK type B-1 pinusRadiata_027562 Q9FYT8 Cyclin- e−118 211 304
dependent
kinase B1-2
131 CDK type C pinusRadiata_001504 Q9LNN0 F8L10.9 0 434 790
protein
132 CDK type C pinusRadiata_015211 Q9LNN0 F8L10.9 0 371 746
protein
133 CDK type C pinusRadiata_020421 P93320 Cdc2MsC 0 318 432
protein
134 CDK type D pinusRadiata_003187 O80345 CDK- e−137 226 485
ACTIVATING
KINASE 1AT
(CDK-
ACTIVATING
KINASE
CAK1AT)
135 CDK type D pinusRadiata_015661 Q947K6 CDK- 0 266 407
ACTIVATING
KINASE.
136 Cyclin A pinusRadiata_013874 Q96226 Cyclin e−108 223 474
137 Cyclin A pinusRadiata_014615 CAC27333 PUTATIVE A- 0 332 390
LIKE CYCLIN
(FRAGMENT)
138 Cyclin B pinusRadiata_004578 O65064 Probable 9E−87 162 217
G2/mitotic-
specific
cyclin
(Fragment)
139 Cyclin B pinusRadiata_023387 O04389 B-like 2E−98 220 466
cyclin
140 Cyclin D pinusRadiata_006970 P93103 CYCLIN-D 1E−75 135 293
LIKE PROTEIN
141 Cyclin D pinusRadiata_010322 CAC17049 SEQUENCE 33 e−131 171 254
FROM PATENT
WO0065040
142 Cyclin D pinusRadiata_022721 P93103 CYCLIN-D 1E−76 137 289
LIKE PROTEIN
143 Cyclin D pinusRadiata_023407 Q9SMD5 CYCD3,2 8E−90 139 278
PROTEIN
144 Cyclin- pinusRadiata_001945 Q947Y1 PUTATIVE 5E−55 74 86
dependent CYCLIN-
kinase DEPENDENT
regulatory KINASE
subunit REGULATORY
SUBUNIT
145 Cyclin- pinusRadiata_008233 CAB69358 SEQUENCE 1 4E−49 65 86
dependent FROM PATENT
kinase WO9841642
regulatory
subunit
146 Cyclin- pinusRadiata_008234 CAB69358 SEQUENCE 1 4E−49 65 86
dependent FROM PATENT
kinase WO9841642
regulatory
subunit
147 Cyclin- pinusRadiata_022054 CAB69358 SEQUENCE 1 8E−55 70 82
dependent FROM PATENT
kinase WO9841642
regulatory
subunit
148 Histone pinusRadiata_012137 Q9FK40 Histone 0 496 555
acetyltransferase acetyltransferase
(AT5g50320/MXI22_3)
149 Histone pinusRadiata_012582 O80378 181 0 354 402
acetyltransferase (Fragment)
SPTREMBL
150 Histone pinusRadiata_015285 O80378 181 0 342 401
acetyltransferase (Fragment)
151 Histone pinusRadiata_017229 Q9LNC4 F9P14.9 e−118 268 585
acetyltransferase protein
152 Histone pinusRadiata_020724 Q9AR19 Histone e−177 355 639
acetyltransferase acetyltransferase
GCN5
(Expressed
protein)
153 Histone pinusRadiata_004555 AAM13152 HISTONE 0 331 488
deacetylase DEACETYLASE
154 Histone pinusRadiata_004556 AAM13152 HISTONE 0 331 488
deacetylase DEACETYLASE
155 Histone pinusRadiata_005729 Q9M4U5 Histone 9E−62 154 348
deacetylase deacetylase
2 isoform b
156 Histone pinusRadiata_007395 AAM13152 HISTONE 0 335 426
deacetylase DEACETYLASE
157 Histone pinusRadiata_009503 Q8W508 Histone 0 365 427
deacetylase deacetylase
158 Histone pinusRadiata_011283 AAM19887 AT1G08460/T27G7_7 0 255 366
deacetylase
159 Histone pinusRadiata_012322 Q9FML2 HISTONE 0 327 435
deacetylase DEACETYLASE
(PUTATIVE
HISTONE
DEACETYLASE)
161 Histone pinusRadiata_023236 Q8RX28 Putative e−144 238 390
deacetylase histone
deacetylase
162 Peptidylprolyl pinusRadiata_000171 Q9FJL3 PEPTIDYLPROLYL 0 364 549
isomerase ISOMERASE
163 Peptidylprolyl pinusRadiata_000172 Q38949 FK506 0 365 552
isomerase BINDING
PROTEIN
FKBP62
(ROF1)
164 Peptidylprolyl pinusRadiata_001480 Q8VXA5 PUTATIVE e−125 161 172
isomerase CYCLOSPORIN
A-BINDING
PROTEIN
168 Peptidylprolyl pinusRadiata_001692 FKB7_WHEAT 70 kDa 0 418 553
isomerase peptidylprolyl
isomerase
(EC 5.2.1.8)
169 Peptidylprolyl pinusRadiata_005313 AAB64339 FKBP-TYPE 1E−97 135 175
isomerase PEPTIDYL-
PROLYL CIS-
TRANS
ISOMERASE
170 Peptidylprolyl pinusRadiata_006362 BAB39983 PUTATIVE 3E−77 129 168
isomerase PEPTIDYL-
PROLYL CIS-
TRANS
ISOMERASE,
CHLOROPLA . . .
171 Peptidylprolyl pinusRadiata_006493 Q9C835 Hypothetical 2E−62 128 235
isomerase 26.4 kDa
protein (EC
5.2.1.8)
(Peptidyl-
prol . . .
172 Peptidylprolyl pinusRadiata_006983 AAK96784 CYCLOPHILIN e−103 151 204
isomerase
174 Peptidylprolyl pinusRadiata_007665 Q9LDC0 FKBP-like e−138 239 378
isomerase protein
(Genomic
DNA,
chromosome
3, P1 clone:
175 Peptidylprolyl pinusRadiata_012196 Q93VG0 Cyclophilin 4E−74 132 160
isomerase (EC 5.2.1.8)
(Peptidyl-
prolyl cis-
trans
176 Peptidylprolyl pinusRadiata_013382 Q9C588 HYPOTHETICAL 0 288 581
isomerase 60.2 KDA
PROTEIN
177 Peptidylprolyl pinusRadiata_016461 O04287 IMMUNOPHILIN 9E−66 88 109
isomerase
178 Peptidylprolyl pinusRadiata_017611 Q9C566 Cyclophilin- e−163 276 360
isomerase 40 (EC
5.2.1.8)
(Expressed
protein)
179 Peptidylprolyl pinusRadiata_019776 AAM14253 HYPOTHETICAL e−110 146 190
isomerase 20.3 KDA
PROTEIN
180 Peptidylprolyl pinusRadiata_020659 AAO63961 Hypothetical 7E−85 159 227
isomerase protein
SPTREMBL
181 Peptidylprolyl pinusRadiata_022559 AAK43974 PUTATIVE 2E−73 113 153
isomerase PEPTIDYL-
PROLYL CIS-
TRANS
ISOMERASE
182 Peptidylprolyl pinusRadiata_024188 Q9P3X9 PEPTIDYL- e−122 210 379
isomerase PROLYL CIS-
TRANS
ISOMERASE
(EC 5.2.1.8)
183 Peptidylprolyl pinusRadiata_027973 Q9SR70 T22K18.11 3E−69 125 171
isomerase protein
(AT3g10060/T22K18_11)
184 WD40 repeat pinusRadiata_001353 Q9FNN2 WD-repeat 0 317 590
protein protein-
like SPTREMBL
185 WD40 repeat pinusRadiata_001978 PRL1_ARATH PP1/PP2A 0 341 502
protein phosphatases
pleiotropic
regulator
PRL1
186 WD40 repeat pinusRadiata_002810 AAK49947 TGF-BETA 0 273 326
protein RECEPTOR-
INTERACTING
PROTEIN 1
187 WD40 repeat pinusRadiata_002811 AAK49947 TGF-BETA 0 273 326
protein RECEPTOR-
INTERACTING
PROTEIN 1
188 WD40 repeat pinusRadiata_002812 AAM15129 HYPOTHETICAL e−127 225 521
protein 58.9 KDA
PROTEIN
189 WD40 repeat pinusRadiata_003514 Q9FJ94 Similarity e−137 242 445
protein to myosin
heavy chain
kinase SPTREMBL
190 WD40 repeat pinusRadiata_004104 GBB_ORYSA Guanine 0 294 378
protein nucleotide-
binding
protein beta
subunit
191 WD40 repeat pinusRadiata_005595 Q9FTT9 PUTATIVE 0 320 459
protein DKFZP564O0463
PROTEIN
192 WD40 repeat pinusRadiata_005754 Q94JT6 At1g78070/F28K19_28 SPTREMBL e−168 294 451
protein
193 WD40 repeat pinusRadiata_006463 GBLP_MEDSA Guanine e−152 261 324
protein nucleotide-
binding
protein beta
subunit-like . . .
194 WD40 repeat pinusRadiata_006665 AAM20553 HYPOTHETICAL 0 655 1169
protein 119.9 KDA
PROTEIN.
195 WD40 repeat pinusRadiata_006750 AAM13119 HYPOTHETICAL e−158 264 312
protein 35.4 KDA
PROTEIN.
196 WD40 repeat pinusRadiata_007030 Q9LJN8 MITOTIC e−169 284 335
protein CHECKPOINT
PROTEIN.
197 WD40 repeat pinusRadiata_007854 Q8H919 Putative WD 0 429 644
protein domain
containing
protein
198 WD40 repeat pinusRadiata_007917 AAD10151 PUTATIVE WD- 0 353 462
protein 40 REPEAT
PROTEIN,
MSI4
199 WD40 repeat pinusRadiata_007989 Q9LRZ0 Genomic DNA, 0 480 687
protein chromosome
3, TAC
clone: K20I9
200 WD40 repeat pinusRadiata_008506 MSI1_LYCES WD-40 repeat 0 364 420
protein protein MSI1
201 WD40 repeat pinusRadiata_008692 Q8W403 Sec13p e−134 218 301
protein
202 WD40 repeat pinusRadiata_008693 Q8W403 Sec13p e−137 222 301
protein
203 WD40 repeat pinusRadiata_009170 Q9M0V4 U3 snoRNP- e−127 244 524
protein associated-
like
protein.
SPTREMBL
204 WD40 repeat pinusRadiata_009408 Q9SAJ0 F23A5.2(FORM e−171 282 350
protein 2).
205 WD40 repeat pinusRadiata_009522 Q8RXQ4 Hypothetical e−129 231 395
protein 43.8 kDa
protein
206 WD40 repeat pinusRadiata_009734 AAO27452 Peroxisomal e−142 227 317
protein targeting
signal type
2 receptor.
SPTREMBL
207 WD40 repeat pinusRadiata_009815 AAM20433 CELL CYCLE 0 326 500
protein SWITCH
PROTEIN
208 WD40 repeat pinusRadiata_010670 AAN72058 Expressed e−157 264 345
protein protein
209 WD40 repeat pinusRadiata_011297 AAM13100 WD REPEAT e−157 262 337
protein PROTEIN
ATAN11
210 WD40 repeat pinusRadiata_013098 AAM13153 HYPOTHETICAL e−136 229 352
protein 39.1 KDA
PROTEIN.
211 WD40 repeat pinusRadiata_013172 Q8H0T9 Hypothetical 0 437 860
protein protein
212 WD40 repeat pinusRadiata_013589 AAK52092 WD-40 REPEAT 0 448 512
protein PROTEIN
213 WD40 repeat pinusRadiata_013608 AAC27402 EXPRESSED e−141 202 358
protein PROTEIN
214 WD40 repeat pinusRadiata_014299 Q9XED5 Cell cycle 0 335 488
protein switch
protein SPTREMBL
215 WD40 repeat pinusRadiata_014498 Q9FH64 WD REPEAT e−152 206 329
protein PROTEIN-LIKE
216 WD40 repeat pinusRadiata_014548 Q93ZS6 HYPOTHETICAL 0 505 763
protein 82.2 KDA
PROTEIN
217 WD40 repeat pinusRadiata_014610 Q9M298 Hypothetical 0 450 922
protein 104.7 kDa
protein
218 WD40 repeat pinusRadiata_016090 Q9SIY9 Putative WD- 0 442 802
protein 40 repeat
protein SPTREMBL
219 WD40 repeat pinusRadiata_016722 O22826 Putative e−159 257 310
protein splicing
factor SPTREMBL
220 WD40 repeat pinusRadiata_016785 AAG60193 PUTATIVE 0 344 464
protein WD40 PROTEIN
221 WD40 repeat pinusRadiata_017094 Q9LV35 WD40-REPEAT 0 406 604
protein PROTEIN
222 WD40 repeat pinusRadiata_017527 Q9AYE4 Hypothetical e−154 254 314
protein 35.3 kDa
protein
223 WD40 repeat pinusRadiata_017591 O80706 F8K4.21 0 905 1218
protein protein
224 WD40 repeat pinusRadiata_017769 Q9XIJ3 T10O24.21 0 446 607
protein
225 WD40 repeat pinusRadiata_018047 Q8VZY6 FERTILIZATION- 0 285 373
protein INDEPENDENT
ENDOSPERM
PROTEIN
226 WD40 repeat pinusRadiata_018414 Q947M8 COPI 0 455 638
protein
227 WD40 repeat pinusRadiata_018986 Q9LFE2 WD40-repeat 0 518 886
protein protein
228 WD40 repeat pinusRadiata_019479 Q9SZA4 WD-repeat e−156 276 454
protein protein-like
protein
229 WD40 repeat pinusRadiata_020144 Q8W514 MSI TYPE 0 288 413
protein NUCLEOSOME/CHROMATIN
ASSEMBLY
FACTOR C
230 WD40 repeat pinusRadiata_022480 Q8W514 MSI type e−167 287 426
protein nucleosome/chromatin
assembly
factor C
231 WD40 repeat pinusRadiata_023079 Q8W514 MSI type e−169 283 397
protein nucleosome/chromatin
assembly
factor C. SPTREMBL
232 WD40 repeat pinusRadiata_026739 Q93YS7 Putative WD- 0 591 918
protein repeat
membrane
protein.
SPTREMBL
233 WD40 repeat pinusRadiata_026951 Q93VS5 AT4g18900/F13C5_70 e−163 290 503
protein (Hypothetical
protein)
234 WEE1-like pinusRadiata_026529 Q9SRY9 F22D16.3 e−122 209 451
protein PROTEIN
235 WD40 repeat eucalyptusSpp_006366 Q8LF96 PRL1 protein 0 374 492
protein
236 WD40 repeat eucalyptusSpp_017378 O22607 WD-40 repeat 0 371 453
protein protein MSI4
237 WD40 repeat pinusRadiata_000888 O22466 WD-40 repeat 0 364 420
protein protein MSI1
238 Cyclin- pinusRadiata_014166 Q9FKB5 GENOMIC DNA, 5E−42 114 304
dependent CHROMOSOME
kinase 5, TAC
inhibitor CLONE: K24G6
(CYCLIN-
DEPENDENT
239 CDK type D pinusRadiata_003189 Q9M5G4 CDK- 8E−21 56 100
activating
kinase
240 Histone pinusRadiata_009356 Q9FJT8 Histone 7E−85 187 510
acetyltransferase acetyltransferase
HAT B
241 Histone pinusRadiata_000065 Q9LPW6 F13K23.8 5E−18 71 209
deacetylase protein.
242 Histone pinusRadiata_014197 Q8GXJ1 Putative e−170 308 519
deacetylase histone
deacetylase
243 Peptidylprolyl pinusRadiata_009081 Q9ZRQ9 Cyclophilin e−106 185 190
isomerase (EC 5.2.1.8)
(Peptidyl-
prolyl cis-
trans
244 Peptidylprolyl pinusRadiata_013417 Q8H4T0 Putative e−140 235 345
isomerase peptidyl-
prolyl cis-
trans
isomerase
protein
245 WD40 repeat pinusRadiata_005755 Q9SKW4 F5J5.6. e−143 144 319
protein
246 WD40 repeat pinusRadiata_006670 Q9LDG7 WD-40 repeat e−163 393 960
protein protein-like
(MJK13.13
protein)
247 WD40 repeat pinusRadiata_007027 Q8GWR1 Hypothetical e−157 276 470
protein protein.
248 WD40 repeat pinusRadiata_007276 Q9LF27 Hypothetical e−138 235 428
protein 47.3 kDa
protein
249 WD40 repeat pinusRadiata_007390 Q94AH4 PUTATIVE 3E−17 53 158
protein RING ZINC
FINGER
PROTEIN.
250 WD40 repeat pinusRadiata_012648 O22212 Hypothetical 0 324 561
protein 61.8 kDa
Trp-Asp
repeats
containing
protein
251 WD40 repeat pinusRadiata_013171 Q8H0T9 Hypothetical 0 437 860
protein protein.
252 Cyclin B eucalyptusSpp_045414 Q9LDM4 F2D10.10 e−142 255 423
(F5M15.6)
253 Cyclin- eucalyptusSpp_044328 Q9FKB5 GENOMIC DNA, 1E−54 121 260
dependent CHROMOSOME
kinase 5, TAC
inhibitor CLONE: K24G6
(CYCLIN-
DEPENDENT
254 Histone eucalyptusSpp_015615 Q9AR19 Histone 0 390 563
acetyltransferase acetyltransferase
GCN5
(Expressed
protein)
255 Peptidylprolyl eucalyptusSpp_017239 Q8GWM6 Hypothetical 0 364 591
isomerase protein
256 WD40 repeat eucalyptusSpp_018643 Q93VS5 AT4g18900/F13C5_70 0 229 327
protein (Hypothetical
protein)
257 WD40 repeat eucalyptusSpp_019127 Q9SRX9 F22D16.14 e−131 232 337
protein protein.
SPTREMBL
258 WD40 repeat eucalyptusSpp_022624 Q9LFE2 WD40-repeat 0 594 868
protein protein
259 WD40 repeat eucalyptusSpp_032424 Q8LPL5 Cell cycle 0 255 327
protein switch
protein
260 WD40 repeat eucalyptusSpp_037472 Q9SK69 Putative WD- 0 461 677
protein 40 repeat
protein
(AT2G20330/F11A3.12)

Claims

1. An isolated polynucleotide comprising a nucleic acid sequence that (i) is selected from the group consisting of SEQ ID NOs: 1-260 and variants thereof, (ii) is selected from the group consisting of SEQ ID NOs: 521-772 and variants thereof, or (iii) encodes the catalytic or substrate-binding domain of a polypeptide selected from any one of SEQ ID NOs: 261-520, wherein the polynucleotide encodes a polypeptide having the activity of said polypeptide selected from any one of SEQ ID NOs: 261-520.

2.-5. (canceled)

6. The isolated polynucleotide of claim 1, wherein the variant has a sequence identity that is greater than or equal to 80% to any one of SEQ ID NOs: 1-260 or encodes a protein with an amino acid sequence having a sequence identity that is greater than 60%, 65%, 70%, 75%, 80%, 85% or 90% to any one of SEQ ID NOs: 261-520, and wherein the protein encoded by the polynucleotide possesses the activity of the protein encoded by said any one of SEQ ID NOs: 1-260.
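The identity thresholds recited in claim 6 can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and that identity is counted over columns where neither sequence has a gap; the claim itself does not fix an alignment method or a denominator, so both choices, along with the function names, are assumptions made only for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are pre-aligned strings of equal length, with '-'
    marking gaps. The denominator (ungapped columns) is an illustrative choice.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped aligned positions to compare")
    return 100.0 * matches / compared


def meets_nucleotide_threshold(aligned_a: str, aligned_b: str,
                               threshold: float = 80.0) -> bool:
    """True when identity is greater than or equal to the threshold (80% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


def meets_protein_threshold(aligned_a: str, aligned_b: str,
                            threshold: float = 60.0) -> bool:
    """True when identity is strictly greater than the threshold (60% by default)."""
    return percent_identity(aligned_a, aligned_b) > threshold
```

Note that the nucleotide check uses "greater than or equal to" while the protein check uses strictly "greater than", mirroring the wording of the claim.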

7.-8. (canceled)

9. A DNA construct comprising at least one polynucleotide of claim 1, operably linked in sense or antisense orientation to a promoter, wherein the promoter is selected from the group consisting of a constitutive promoter, a strong promoter, an inducible promoter, a regulatable promoter, a temporally regulated promoter, and a tissue-preferred promoter.

10.-13. (canceled)

14. The DNA construct of claim 9, wherein an RNA transcript of the polynucleotide is complementary to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260.
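The complementarity between an RNA transcript and a target sequence, as recited in claim 14, can be sketched in Python as follows. The sketch assumes exact, full-length Watson-Crick pairing with both strands written 5′ to 3′; the claim does not require this strict reading, and the function names are hypothetical.

```python
# Watson-Crick partner of each DNA base, expressed in the RNA alphabet.
_DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complementary_rna(dna: str) -> str:
    """Return the RNA strand (5'->3') that base-pairs with a 5'->3' DNA sequence."""
    try:
        # Reverse the template so the result is also written 5'->3'.
        return "".join(_DNA_TO_RNA_COMPLEMENT[base] for base in reversed(dna.upper()))
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]!r}") from None


def is_complementary(rna: str, dna: str) -> bool:
    """True when the RNA pairs with the DNA at every position (strict assumption)."""
    return rna.upper() == complementary_rna(dna)
```

In practice, antisense transcripts are usually assessed by alignment or hybridization behavior rather than exact string equality, so this check is only the simplest possible instance.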

15. A plant cell, comprising the DNA construct of claim 9.

16. The plant cell of claim 15, wherein the plant cell is in a transgenic plant, and wherein the phenotype of the plant is different from a plant of the same species which does not comprise the plant cell, wherein the difference in phenotype is in lignin quality, lignin structure, wood composition, wood appearance, wood density, wood strength, wood stiffness, cellulose polymerization, fiber dimensions, lumen size, other plant components, plant cell division, plant cell development, number of cells per unit area, cell size, cell shape, cell wall composition, rate of wood formation, aesthetic appearance of wood, formation of stem defects, average microfibril angle, width of the S2 cell wall layer, rate of growth, rate of root formation, ratio of root to branch vegetative development, leaf area index, and leaf shape.

17.-20. (canceled)

21. The transgenic plant of claim 16, wherein the plant is of a species of Eucalyptus or Pinus.

22. The transgenic plant of claim 16, wherein the plant exhibits one or more traits selected from the group consisting of increased drought tolerance, herbicide resistance, reduced or increased height, reduced or increased branching, enhanced cold and frost tolerance, improved vigor, enhanced color, enhanced health and nutritional characteristics, improved storage, enhanced yield, enhanced salt tolerance, enhanced resistance of the wood to decay, enhanced resistance to fungal diseases, altered attractiveness to insect pests, enhanced heavy metal tolerance, increased disease tolerance, increased insect tolerance, increased water-stress tolerance, enhanced sweetness, improved texture, decreased phosphate content, increased germination, increased micronutrient uptake, improved starch composition, improved flower longevity, production of novel resins, and production of novel proteins or peptides, reduced period of juvenility, an increased period of juvenility, propensity to form reaction wood, self-abscising branches, accelerated reproductive development or delayed reproductive development as compared to a plant of the same species that has not been transformed with the DNA construct.

23.-31. (canceled)

32. A wood obtained from a transgenic tree which has been transformed with the DNA construct of claim 9.

33. A wood pulp obtained from a transgenic tree which has been transformed with the DNA construct of claim 9.

34.-36. (canceled)

37. An isolated polypeptide comprising an amino acid sequence encoded by the isolated polynucleotide of claim 1.

38.-43. (canceled)

44. The isolated polynucleotide of claim 1, wherein the polynucleotide comprises fewer than about 100 nucleotide bases.

45. A method of correlating gene expression in two different samples, comprising: detecting a level of expression of one or more genes encoding a product encoded by a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260 and conservative variants thereof in a first sample; detecting a level of expression of the one or more genes in a second sample; comparing the level of expression of the one or more genes in the first sample to the level of expression of the one or more genes in the second sample; and correlating a difference in expression level of the one or more genes between the first and second samples.
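The comparing step of claim 45 can be illustrated with a short Python sketch. It assumes expression levels are available as non-negative numbers keyed by a gene identifier and uses a log2 ratio with a pseudocount as the measure of difference; the claim does not prescribe any particular measure, and the identifiers, pseudocount, and cutoff shown here are hypothetical.

```python
import math


def log2_fold_changes(first: dict, second: dict, pseudocount: float = 1.0) -> dict:
    """Log2 ratio (second sample over first sample) for genes measured in both.

    Assumption: values are non-negative expression levels; the pseudocount
    avoids division by zero and is an illustrative choice.
    """
    shared = sorted(set(first) & set(second))
    return {
        gene: math.log2((second[gene] + pseudocount) / (first[gene] + pseudocount))
        for gene in shared
    }


def differing_genes(first: dict, second: dict, min_abs_log2: float = 1.0) -> dict:
    """Genes whose absolute log2 fold change meets the (hypothetical) cutoff."""
    changes = log2_fold_changes(first, second)
    return {gene: value for gene, value in changes.items() if abs(value) >= min_abs_log2}


# Hypothetical expression levels for two genes in two samples.
first_sample = {"SEQ_1": 3.0, "SEQ_2": 7.0}
second_sample = {"SEQ_1": 15.0, "SEQ_2": 7.0}
changed = differing_genes(first_sample, second_sample)
```

With these made-up values, only the first gene passes the cutoff; the second is unchanged between samples.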

46. The method of claim 45, wherein the first sample and the second sample are plant tissues that are from the same or different plant.

47. The method of claim 4, wherein the first sample and the second sample are (i) from the same plant tissue, (ii) harvested during a different season of the year, and/or (iii) obtained from plants in different stages of development.

48.-50. (canceled)

51. The method of claim 46 wherein the plant tissue is selected from the group consisting of vascular tissue, apical meristem, vascular cambium, xylem, phloem, root, flower, cone, fruit, and seed.

52. The method of claim 51, wherein the plant tissues are obtained from at least one of (i) a different type of tissue, (ii) a different stage of development, or (iii) different stages of the cell cycle.

53.-54. (canceled)

55. The method of claim 51, wherein the plant tissues are from one or more species of Eucalyptus or Pinus.

56. (canceled)

57. The method of claim 45, wherein the step of detecting is effected using one or more polynucleotides capable of hybridizing to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260 under standard hybridization conditions.

58. (canceled)

59. The method of claim 57, wherein the step of detecting is accomplished by hybridization to a labeled nucleic acid.

60. (canceled)

61. The method of claim 57, wherein at least one of the polynucleotides hybridizes to a 3′ untranslated region of the nucleic acid sequence.

62. (canceled)

63. The method of claim 57, wherein the one or more polynucleotides comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 521-772.

64.-66. (canceled)

67. The method of claim 45, further comprising, prior to the detecting steps, the step of amplifying at least one of the genes.

68. The method of claim 45, further comprising, prior to the detecting steps, the step of labeling at least one of the genes with a detectable label.

69. A combination for detecting expression of one or more genes, comprising two or more oligonucleotides, wherein each oligonucleotide is capable of hybridizing to a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260 or to an RNA transcript of a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260.

70. (canceled)

71. The combination of claim 69, wherein the oligonucleotides each hybridize to different nucleic acid sequences or to different RNA transcripts.

72. (canceled)

73. The combination of claim 69, wherein at least one of the oligonucleotides hybridizes to a 3′ untranslated region of a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 1-260.

74.-75. (canceled)

76. The combination of claim 69, wherein at least one of the oligonucleotides comprises a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 521-772.

77.-83. (canceled)

84. The combination of claim 69, comprising from about 2 to about 5000 oligonucleotides.

85. The combination of claim 84, wherein each of the oligonucleotides is labeled with a detectable label.

86. A microarray comprising the combination of claim 69 provided on a solid support, wherein each of the oligonucleotides occupies a unique location on said solid support.

87. (canceled)

88. A method for detecting one or more nucleic acid sequences in a sample, comprising contacting the sample with the combination of claim 69.

89.-91. (canceled)

92. The method of claim 88, wherein at least one of the oligonucleotides hybridizes to a 3′ untranslated region of a gene that comprises the nucleic acid sequence of at least any one of SEQ ID NOs: 1-260.

93.-103. (canceled)

104. A kit for detecting gene expression comprising the microarray of claim 86 and one or more buffers or reagents for a nucleotide hybridization reaction.
