Patent application title:

Novel gene and protein encoded by the gene

Publication number:

US20060063152A1

Publication date:

2006-03-23

Application number:

10/204,639

Filed date:

2001-12-20

Abstract:

Novel DNAs containing the regions which encode proteins have been directly cloned from cDNA libraries derived from the human adult whole brain, the human adult hippocampus and the human embryonic whole brain, the nucleotide sequences thereof have been determined, and their functions have been identified. The present invention provides DNA which comprises the nucleotide sequence encoding the following polypeptide (a) or (b): (a) a polypeptide comprising an amino acid sequence which is identical or substantially identical to an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70; (b) a polypeptide comprising an amino acid sequence derived from the amino acid sequence represented by any one of SEQ ID NOS: 1 to 70 by deletion, substitution or addition of a section of amino acids, and having biological activity which is substantially the same characteristic with the function of the polypeptide of (a); a recombinant polypeptide, which is encoded by the above DNA; and a protein containing the polypeptide.

Inventors:

Osamu OHARA 19 🇯🇵 Chiba, Japan
Takahiro Nagase 17 🇯🇵 Chiba, Japan
Daisuke Nakajima 4 🇯🇵 Chiba, Japan

Classification:

A61P13/08 » CPC further

Drugs for disorders of the urinary system of the prostate

A61P15/00 » CPC further

Drugs for genital or sexual disorders ; Contraceptives

A61P35/00 » CPC further

Antineoplastic agents

A61K2039/505 » CPC further

Medicinal preparations containing antigens or antibodies comprising antibodies

C12Q1/68 IPC

Measuring or testing processes involving enzymes, nucleic acids or microorganisms ; Compositions therefor; Processes of preparing such compositions involving nucleic acids

C07H21/02 IPC

Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids with ribosyl as saccharide radical

C12P21/06 IPC

Preparation of peptides or proteins produced by the hydrolysis of a peptide bond, e.g. hydrolysate products

C12M1/34 IPC

Apparatus for enzymology or microbiology; Measuring or testing with condition measuring or sensing means, e.g. colony counters

C07K14/47 » CPC main

Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals

C07K16/18 IPC

Immunoglobulins [IGs], e.g. monoclonal or polyclonal antibodies against material from animals or humans

Description

TECHNICAL FIELD

The present invention relates to DNA and a gene containing the DNA, and a recombinant polypeptide encoded by the DNA and a novel recombinant protein containing the polypeptide.

BACKGROUND ART

An enormous amount of information on the nucleotide sequence of the human genome has been obtained by large-scale sequencing in the Human Genome Project and analysis of the information is continuing on a daily basis.

The ultimate goal of the Human Genome Project is not just simple determination of the entire nucleotide sequence of the genome, but also the elucidation of various human life phenomena based on the structural information, that is the nucleotide sequence information of DNA.

Only limited regions of the human genome sequence encode proteins. Currently, coding regions are predicted using neural networks or an information science technique called the Hidden Markov Model. However, the predictive ability of these models is not yet sufficiently reliable.

DISCLOSURE OF THE INVENTION

For the purpose of finding novel genes, we have completed the present invention by succeeding in directly cloning novel DNAs comprising regions that encode proteins from cDNA libraries derived from the human adult whole brain, the human adult hippocampus and the human embryonic whole brain, and determining the nucleotide sequences thereof.

In a first embodiment, the present invention relates to DNA comprising a nucleotide sequence encoding the following (a) or (b):

(a) a polypeptide consisting of an amino acid sequence which is identical or substantially identical to an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70;
(b) a polypeptide consisting of an amino acid sequence derived from an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70 by deletion, substitution or addition of a section of amino acid(s), and having biological activity which is substantially the same characteristic with the function of the polypeptide of (a).

Examples of such DNA include, but are not limited to, DNAs comprising the nucleotide sequences of SEQ ID NOS: 71 to 140.

In a second embodiment, the present invention further relates to a DNA hybridizing to the DNA of the first embodiment of the present invention under stringent conditions, and encoding a polypeptide having biological activity which is substantially the same characteristic with the function of the polypeptide of (a) above.

Hereinafter, the DNAs of the first and the second embodiments of the present invention are together referred to as “the DNA of the present invention.” Further, the present invention also relates to antisense DNA comprising a nucleotide sequence which is substantially complementary to the DNA of the present invention.

In a third embodiment, the present invention relates to a gene construct containing the DNA of the present invention. The term “gene construct” in the present specification refers to any artificially engineered gene. Examples of the gene construct include, but are not limited to, a vector containing the DNA of the present invention or the antisense DNA of the DNA of the present invention, and an expression vector of the DNA of the present invention.

In a fourth embodiment, the present invention relates to the following (a) or (b):

(a) a polypeptide, consisting of an amino acid sequence which is identical or substantially identical to an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70;
(b) a polypeptide, consisting of an amino acid sequence derived from the amino acid sequence represented by any one of SEQ ID NOS: 1 to 70 by deletion, substitution or addition of a section of amino acids, and having biological activity which is substantially the same characteristic with the function of the polypeptide of (a).

In a fifth embodiment, the present invention relates to a recombinant polypeptide encoded by the gene construct of the third embodiment of the present invention.

Hereinafter, the above polypeptides are together also referred to as “the polypeptide of the present invention.” The term “polypeptide” in the present specification refers to “polymers of amino acids of any molecular weight.” The present invention also relates to a recombinant protein containing the polypeptide of the present invention. As defined above, in the present specification the term “polypeptide” is not limited by molecular weight, and therefore the term “the polypeptide of the present invention” also includes a recombinant protein containing the polypeptide of the present invention.

In a sixth embodiment, the present invention relates to an antibody against the polypeptide of the present invention.

In a seventh embodiment, the present invention relates to a DNA chip on which the DNAs of the present invention are arrayed.

In an eighth embodiment, the present invention relates to a polypeptide chip on which the polypeptides of the present invention are arrayed.

In a ninth embodiment, the present invention relates to an antibody chip on which the antibodies of the sixth embodiment of the present invention are arrayed.

Table 1 shows the names of clones having the DNA of the present invention, lengths of the polypeptide of the present invention, their putative functions and grounds for prediction.

The DNAs of the present invention are identified by isolating them as cDNA fragments from cDNA libraries and then determining their nucleotide sequences. We prepared the libraries using, as the starting material, commercially available (Clontech) mRNA of the human adult whole brain, the human adult hippocampus and the human embryonic whole brain.

Specifically, clones are randomly isolated from cDNA libraries derived from the human adult whole brain, the human adult hippocampus and the human embryonic whole brain prepared according to the method of Ohara et al. (DNA Research 4:53-59 (1997)).

Next, redundant clones (clones containing the same sequences) are removed by hybridization, followed by in vitro transcription and translation. For each clone confirmed to yield a product of 50 kDa or more, both termini of the nucleotide sequence are determined.

Homology searches are then performed against databases of known genes, using the terminal nucleotide sequences thus obtained as queries. For each clone that the searches show to be new, the full-length nucleotide sequence is determined.
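The selection steps described above can be sketched as a simple filter. This is only an illustrative model, not the procedure actually used: the `Clone` record, its field names and the function `select_novel_clones` are hypothetical, redundancy is approximated by exact sequence equality (the text removes redundant clones by hybridization), and the outcome of the homology search is assumed to be supplied as a flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clone:
    """Hypothetical record for one randomly isolated cDNA clone."""
    name: str
    insert_seq: str           # stand-in key for the redundancy check
    product_kda: float        # size of the in vitro translation product
    matches_known_gene: bool  # assumed result of the terminal-sequence homology search


def select_novel_clones(clones: List[Clone], min_kda: float = 50.0) -> List[str]:
    """Return names of clones that pass all three filters, in input order."""
    seen = set()
    selected = []
    for clone in clones:
        # 1. Remove redundant clones (here: identical insert sequences).
        if clone.insert_seq in seen:
            continue
        seen.add(clone.insert_seq)
        # 2. Keep only clones whose product is 50 kDa or more.
        if clone.product_kda < min_kda:
            continue
        # 3. Keep only clones with no match among known genes.
        if clone.matches_known_gene:
            continue
        selected.append(clone.name)
    return selected


library = [
    Clone("c1", "AAAA", 62.0, False),
    Clone("c2", "AAAA", 62.0, False),  # redundant with c1
    Clone("c3", "CCCC", 35.0, False),  # product too small
    Clone("c4", "GGGG", 80.0, True),   # already known
    Clone("c5", "TTTT", 50.0, False),
]
candidates = select_novel_clones(library)
```

Only the clones returned by such a filter would go on to full-length sequencing.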

As described above, unknown genes that cannot be obtained by standard cloning techniques, which rely on known genes, can now be cloned systematically.

Further, the entire region of a human-derived gene containing the DNA of the present invention can also be prepared by a PCR method, such as RACE, while exercising proper care so as not to cause short fragments or any artificial mistakes in obtained sequences.

Furthermore, the present invention provides a recombinant vector which comprises the DNA of the present invention or a gene construct containing the DNA of the present invention; a transformant retaining the recombinant vector; a method for producing the polypeptide of the present invention or a recombinant protein containing the polypeptide, or salts thereof, which comprises the steps of culturing the transformant, producing and accumulating the polypeptide of the present invention or the recombinant protein containing the polypeptide, and collecting these products; and the thus produced polypeptide of the present invention or the recombinant protein containing the polypeptide, or salts thereof.

The present invention also relates to a pharmaceutical preparation comprising the DNA of the present invention or the gene construct; a pharmaceutical preparation comprising a polynucleotide (DNA) comprising a nucleotide sequence which encodes the polypeptide of the present invention or a partial polypeptide thereof, or a recombinant protein containing the polypeptides, an antisense nucleotide comprising a nucleotide sequence substantially complementary to the nucleotide sequence which encodes the polypeptide of the present invention or a partial polypeptide thereof, or a recombinant protein containing the polypeptides; a pharmaceutical preparation comprising the polynucleotide of the present invention and the antisense nucleotide; and a pharmaceutical preparation comprising the polypeptide of the present invention or a partial polypeptide thereof, and a recombinant protein containing the polypeptides.

The present invention further relates to a DNA chip, a peptide chip and an antibody chip that are prepared by arraying the DNAs of the present invention, the polypeptides of the present invention and the antibodies against the polypeptide of the present invention, respectively.

The present invention further relates to an antibody against the polypeptide of the present invention or a partial polypeptide thereof or a recombinant protein containing the polypeptides, or against salts thereof, and a method for screening a substance which specifically interacts with the polypeptide of the present invention by using the polypeptide of the present invention, a partial polypeptide thereof or a recombinant protein containing the polypeptides, or salts thereof, or antibodies against these substances; a kit for screening; and the substance (compound) itself which is identified by the screening method.

Any DNA can be used as the DNA of the present invention, so far as it comprises a nucleotide sequence encoding the above-mentioned polypeptide of the present invention. Further, the DNA of the present invention may be cDNA identified and isolated from cDNA libraries or the like derived from the human brain, from cells or tissues other than the brain, such as the heart, lung, liver, spleen, kidney, and testicle, or synthetic DNA.

A vector used for constructing libraries may be a bacteriophage, a plasmid, a cosmid, or a phagemid. In addition, using total RNA fractions or mRNA fractions prepared from the above cells or tissues, amplification can be performed directly by a reverse transcriptase-polymerase chain reaction (hereinafter abbreviated as the “RT-PCR method”).

Any antisense DNA may be used as an antisense oligonucleotide (DNA) having a nucleotide sequence substantially complementary to the DNA that encodes the polypeptide of the present invention or a partial polypeptide thereof, so far as it comprises a nucleotide sequence substantially complementary to the nucleotide sequence of the DNA, and is capable of inhibiting the expression of the DNA. A substantially complementary sequence is, for example, a nucleotide sequence having preferably about 90% or more, more preferably about 95% or more, and most preferably 100% homology with the full-length or partial nucleotide sequence of the nucleotide sequence complementary to the DNA of the present invention. The antisense DNA of the present invention includes a nucleic acid sequence (RNA or modified DNA) having a similar function to that of the antisense DNA. These antisense DNAs can be produced using a known DNA synthesizer or the like.
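As a rough illustration of the 90% criterion, the following sketch compares a candidate antisense sequence with the reverse complement of a target. It assumes ungapped, equal-length comparison and uses percent identity as a simple stand-in for "homology"; the function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
# Translation table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def is_substantially_complementary(antisense: str, target: str,
                                   threshold: float = 90.0) -> bool:
    """True if the antisense strand reaches the threshold against the
    exact complement of the target (both written 5' to 3')."""
    exact = reverse_complement(target)
    return percent_identity(antisense.upper(), exact) >= threshold


target = "ATGGCCAAGT"                 # made-up 10-nt target
perfect = reverse_complement(target)  # exact antisense strand
```

A real comparison would normally use a gapped alignment tool; the ungapped version is chosen here only to keep the arithmetic visible.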

The term “an amino acid sequence substantially identical to an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70” refers to an amino acid sequence having on the overall average about 70% or more, preferably about 80% or more, further preferably about 90% or more, and particularly preferably about 95% or more homology with each of all the amino acid sequences represented by any one of SEQ ID NOS: 1 to 70.

An example of a polypeptide consisting of an amino acid sequence substantially identical to an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70 of the present invention is a polypeptide having the above homology with the amino acid sequence represented by each of the above SEQ ID NOS, and having biological activity (function) which is substantially the same characteristic with the function of the polypeptide comprising the amino acid sequence represented by each SEQ ID NO. The term “substantially the same characteristic” refers to the activity (function) having the same characteristics.

Further, the polypeptide of the present invention also includes, for example, a polypeptide consisting of an amino acid sequence derived from an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70 by deletion, substitution or addition of a section of amino acids (preferably about 1 to 20, more preferably about 1 to 10, and further preferably several amino acids) or by a combination of these, and having biological activity (function) which is substantially the same characteristic with the function of a polypeptide comprising an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70.
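The number of deletions, substitutions and additions separating a variant from a reference sequence can be counted as an edit distance. The sketch below uses the standard Levenshtein dynamic-programming algorithm, which is an assumption of this illustration and is not named in the text; the function names and the short example peptides are hypothetical. It checks only the count of changes, not the biological activity that the variant must also retain.

```python
def edit_distance(reference: str, variant: str) -> int:
    """Minimum number of single-residue deletions, substitutions or
    additions needed to turn `reference` into `variant`."""
    prev = list(range(len(variant) + 1))
    for i, ref_res in enumerate(reference, start=1):
        curr = [i]
        for j, var_res in enumerate(variant, start=1):
            cost = 0 if ref_res == var_res else 1
            curr.append(min(prev[j] + 1,          # deletion from reference
                            curr[j - 1] + 1,      # addition to reference
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def within_change_limit(reference: str, variant: str,
                        max_changes: int = 20) -> bool:
    """True if the variant differs from the reference by 1..max_changes residues."""
    return 1 <= edit_distance(reference, variant) <= max_changes


reference = "MKTAYIAK"  # made-up example peptide
```

Passing `max_changes=10` gives the corresponding check for the more preferred range of about 1 to 10 residues.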

The polypeptide consisting of an amino acid sequence which is substantially identical to the above amino acid sequence represented by any one of SEQ ID NOS: 1 to 70, or the polypeptide comprising an amino acid sequence derived from the above amino acid sequence by deletion, substitution or addition of a section of the amino acids can be easily produced by, for example, an appropriate combination of methods known by a person skilled in the art, such as site-directed mutagenesis, homologous recombination of genes, primer elongation and PCR.

For the polypeptide to retain biological activity of substantially the same characteristics, one possible method is substitution between homologous amino acids (polar or nonpolar amino acids, hydrophobic or hydrophilic amino acids, positively or negatively charged amino acids, aromatic amino acids and the like) among the amino acids composing the polypeptide. To maintain biological activity of substantially the same characteristics, it is preferred to retain the amino acids within the functional domains contained in each polypeptide of the present invention.
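Substitution between homologous amino acids can be illustrated by grouping residues into classes and asking whether a substitution stays within one class. The grouping below is one common textbook scheme and is an assumption of this sketch; the text does not fix a particular assignment. The names `RESIDUE_GROUPS`, `is_conservative` and `classify_substitutions` are hypothetical.

```python
from typing import List, Optional, Tuple

# One possible grouping of the 20 standard amino acids (one-letter codes).
RESIDUE_GROUPS = {
    "nonpolar": set("GAVLIMP"),
    "aromatic": set("FWY"),
    "polar_uncharged": set("STCNQ"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def residue_group(residue: str) -> Optional[str]:
    """Return the group name of a residue, or None if it is not listed."""
    for name, members in RESIDUE_GROUPS.items():
        if residue in members:
            return name
    return None


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same (known) group."""
    group = residue_group(original)
    return group is not None and group == residue_group(replacement)


def classify_substitutions(reference: str,
                           variant: str) -> List[Tuple[int, str, str, bool]]:
    """List (1-based position, original, replacement, conservative?) for
    every position at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [(pos, ref, var, is_conservative(ref, var))
            for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
            if ref != var]


changes = classify_substitutions("MKDL", "MRDF")
```

Such a check says nothing about whether a residue lies inside a functional domain; that would need a separate annotation of domain boundaries.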

Further, the DNA of the present invention includes DNA comprising a nucleotide sequence encoding an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70, and a DNA hybridizing to the DNA under stringent conditions, and encoding a polypeptide having biological activity (function) which is the same characteristic with the function of a polypeptide consisting of an amino acid sequence represented by each of the sequences.

Under such conditions, examples of DNA capable of hybridizing to DNA comprising a nucleotide sequence encoding an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70 include DNA comprising a nucleotide sequence having on the overall average about 80% or more, preferably about 90% or more, and more preferably about 95% or more homology with the nucleotide sequence of that DNA.

Hybridization can be performed by a method known in the art or a method according to any known methods, such as a method described in Current Protocols in Molecular Biology (edited by Frederick M. Ausubel et al., 1987). When a commercially available library is used, hybridization can also be performed by the method described in the attached instructions.

The term “stringent conditions” means, for example, conditions that allow hybridization to the DNA probe of the present invention in Southern blot hybridization, involving hybridization in a 7% SDS solution containing 1 mM sodium EDTA and 0.5 M dibasic sodium phosphate (pH 7.2) at 65° C., and washing of the membranes in a 1% SDS solution containing 1 mM sodium EDTA and 40 mM dibasic sodium phosphate (pH 7.2) at 65° C. The same stringency can also be achieved by conditions other than the above.

To clone the DNA of the present invention, amplification is performed by a PCR method using a synthetic DNA primer having an appropriate nucleotide sequence corresponding to a part of the polypeptide of the present invention or the like. Alternatively, the DNA can be selected by hybridizing DNA incorporated in an appropriate vector with a labeled DNA fragment or labeled synthetic DNA which encodes a section or the full-length region of the polypeptide of the present invention.

Hybridization can be performed according to, for example, the above-described method in “Current Protocols in Molecular Biology” (edited by Frederick M. Ausubel et al., 1987). In addition, when commercially available libraries are used, hybridization can be performed according to the method described in the attached instructions.

Cloned DNA encoding a polypeptide can be used intact, or can be used after digestion with restriction enzymes if necessary, or after addition of linkers thereto, depending on the purposes. The DNA may contain ATG as a translation initiating codon at the 5′ terminal side, or TAA, TGA or TAG as a translation termination codon at the 3′ terminal side. These translation initiating and termination codons may be added using an appropriate synthetic DNA adaptor.
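The role of the initiation and termination codons can be shown with a small string-level model. In the laboratory these codons would be supplied by a synthetic DNA adaptor; here the addition is simply simulated on a sequence string. The function names are hypothetical, and the sketch assumes that the input is an in-frame coding segment whose length is a multiple of three.

```python
STOP_CODONS = {"TAA", "TGA", "TAG"}


def has_start_codon(dna: str) -> bool:
    """True if the sequence begins with the initiation codon ATG."""
    return dna.upper().startswith("ATG")


def has_stop_codon(dna: str) -> bool:
    """True if the last three bases form a termination codon."""
    return len(dna) >= 3 and dna.upper()[-3:] in STOP_CODONS


def add_translation_signals(dna: str, stop: str = "TAA") -> str:
    """Return the sequence with ATG at the 5' end and a stop codon at the
    3' end, adding each only if it is missing."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("expected an in-frame sequence (length multiple of 3)")
    if stop not in STOP_CODONS:
        raise ValueError("stop must be TAA, TGA or TAG")
    if not has_start_codon(dna):
        dna = "ATG" + dna
    if not has_stop_codon(dna):
        dna = dna + stop
    return dna


completed = add_translation_signals("GCCAAG")
```

A sequence that already carries both signals is returned unchanged.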

An expression vector for the polypeptide of the present invention can be produced according to any method known in the technical field. For example, the vector can be produced by (1) cleaving a DNA fragment containing the DNA of the present invention or a gene having the DNA of the present invention, and (2) ligating the DNA fragment downstream of a promoter in an appropriate expression vector.

Examples of vectors that can be used herein include plasmids derived from Escherichia coli (for example, pBR322, pBR325, pUC18 and pUC118), plasmids derived from Bacillus subtilis (for example, pUB110, pTP5 and pC194), plasmids derived from yeast (for example, pSH19 and pSH15), bacteriophages, such as λ phages, and animal viruses, such as retrovirus, vaccinia virus, baculovirus and the like.

Any promoter can be used in the present invention, so far as it is appropriate for a host to be used for gene expression. Preferred examples of promoters include, when the host is Escherichia coli, trp promoters, lac promoters, recA promoters, λPL promoters and lpp promoters; when the host is Bacillus subtilis, SPO1 promoters, SPO2 promoters and penP promoters; and when the host is yeast, PHO5 promoters, PGK promoters, GAP promoters and ADH promoters. When animal cells are used as hosts, examples of promoters include SRα promoters, SV40 promoters, LTR promoters, CMV promoters and HSV-TK promoters.
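The host-to-promoter pairings listed above can be held in a simple lookup table. The table below copies only the examples named in the text; the dictionary and function names are hypothetical, and the listing is not exhaustive, since any promoter appropriate for the host may be used.

```python
from typing import Dict, List

# Preferred promoter examples per host, as listed in the text.
PROMOTERS_BY_HOST: Dict[str, List[str]] = {
    "Escherichia coli": ["trp", "lac", "recA", "λPL", "lpp"],
    "Bacillus subtilis": ["SPO1", "SPO2", "penP"],
    "yeast": ["PHO5", "PGK", "GAP", "ADH"],
    "animal cell": ["SRα", "SV40", "LTR", "CMV", "HSV-TK"],
}


def preferred_promoters(host: str) -> List[str]:
    """Return the listed promoter examples for a host."""
    if host not in PROMOTERS_BY_HOST:
        raise ValueError(f"no promoter examples listed for host: {host}")
    return list(PROMOTERS_BY_HOST[host])


def is_listed_pair(host: str, promoter: str) -> bool:
    """True if the promoter is among the examples listed for that host."""
    return promoter in PROMOTERS_BY_HOST.get(host, [])
```

For instance, `is_listed_pair("yeast", "lac")` is false because the lac promoter is listed only for Escherichia coli.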

In addition to the above substances, an enhancer, a splicing signal, a polyA addition signal, a selection marker, an SV40 replication origin and the like that are known in the technical field can be added to the expression vector, if desired. Further, if necessary, a protein encoded by the DNA of the present invention can be expressed as a fusion protein with another protein (for example, glutathione S transferase and protein A). Such a fusion protein can be cleaved with an appropriate protease and then separated into its component proteins.

Examples of host cells that are used herein include bacteria of the genus Escherichia or the genus Bacillus, yeast, insect cells, insects, and animal cells.

Specific examples of bacteria of the genus Escherichia that are used herein include Escherichia coli K12/DH1 (Proc. Natl. Acad. Sci. USA, 60:160 (1968)), JM103 (Nucleic Acids Research, 9:309 (1981)), JA221 (Journal of Molecular Biology, 120:517 (1978)), and HB101 (Journal of Molecular Biology, 41:459 (1969)).

Examples of bacteria of the genus Bacillus that are used herein include Bacillus subtilis MI114 (Gene, 24:255 (1983)) and 207-21 (Journal of Biochemistry, 95:87 (1984)).

Examples of yeast that are used herein include Saccharomyces, such as Saccharomyces cerevisiae AH22, AH22R-, NA87-11A, DKD-5D and 20B-12; Schizosaccharomyces pombe NCYC1913 and NCYC2036; and Pichia pastoris.

Examples of animal cells that are used herein include monkey cells, such as COS-7 and Vero; Chinese hamster ovary cells (hereinafter abbreviated as CHO cells) and dhfr gene-deficient CHO cells; mouse L cells, mouse AtT-20, mouse myeloma cells, rat GH3, and human FL cells.

These host cells can be transformed according to a method known in the technical field. For example, transformation can be performed by referring to Proc. Natl. Acad. Sci. USA, 69:2110 (1972); Gene, 17:107 (1982); Molecular & General Genetics, 168:111 (1979); “Methods in Enzymology,” vol. 194, p 182-187 (1991); Proc. Natl. Acad. Sci. USA, 75:1929 (1978); A Separate Volume 8 of Cell Technology, New Experimental Protocols in Cell Technology, p 263-267 (1995) (issued by Shujunsha); and Virology, 52:456 (1973).

The thus obtained transformant, which has been transformed with an expression vector containing the DNA of the present invention or a gene containing the DNA of the present invention, can be cultured according to a method known in the technical field.

For example, when hosts are bacteria of the genus Escherichia, culturing is performed normally at about 15° C. to 43° C. for about 3 to 24 hours, and if necessary, aeration and agitation may be performed. When hosts are bacteria of the genus Bacillus, culturing is performed normally at about 30° C. to 40° C. for about 6 to 24 hours, and if necessary, aeration and agitation may be performed.

A transformant whose host is yeast is normally cultured using media adjusted to have pH of approximately 5 to 8, at about 20° C. to 35° C. for about 24 to 72 hours, and if necessary, aeration and agitation may be performed.

A transformant whose host is an animal cell is normally cultured using media adjusted to have pH of about 6 to 8, at about 30° C. to 40° C. for about 15 to 60 hours, and if necessary, aeration and agitation may be performed.
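The typical culture ranges given above for each host can be collected into one table and used to check a planned culture. This sketch treats the "about" ranges in the text as hard bounds, which is a simplification; the class `CultureRange`, the table keys and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CultureRange:
    temp_c: Tuple[float, float]               # temperature range in degrees C
    hours: Tuple[float, float]                # culture time range in hours
    ph: Optional[Tuple[float, float]] = None  # medium pH range, if stated


# Ranges as stated in the text for each kind of host.
CULTURE_RANGES: Dict[str, CultureRange] = {
    "Escherichia": CultureRange((15, 43), (3, 24)),
    "Bacillus": CultureRange((30, 40), (6, 24)),
    "yeast": CultureRange((20, 35), (24, 72), (5, 8)),
    "animal cell": CultureRange((30, 40), (15, 60), (6, 8)),
}


def within_typical_range(host: str, temp_c: float, hours: float,
                         ph: Optional[float] = None) -> bool:
    """True if the planned conditions fall inside the stated ranges.
    pH is checked only when both a range and a value are available."""
    rng = CULTURE_RANGES[host]
    ok = (rng.temp_c[0] <= temp_c <= rng.temp_c[1]
          and rng.hours[0] <= hours <= rng.hours[1])
    if rng.ph is not None and ph is not None:
        ok = ok and rng.ph[0] <= ph <= rng.ph[1]
    return ok
```

For example, `within_typical_range("yeast", 30, 48, ph=6.0)` holds, while a Bacillus culture at 25° C. falls outside the stated range.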

To isolate and purify the polypeptide or the protein of the present invention from the above culture product, for example, bacteria or cells are collected by a known method after culturing, suspended in an appropriate buffer, disrupted by ultrasonication, lysozyme and/or freezing and thawing, and then centrifuged or filtered, thereby obtaining a crude protein extract. The buffer may contain a protein denaturing agent, such as urea or guanidine hydrochloride, or a surfactant, such as Triton X-100 (trade-mark). When the protein is secreted in a culture solution, bacteria or cells are separated after culturing from the supernatant by a known method, thereby collecting the supernatant. The thus obtained culture supernatant or the protein contained in an extract can be purified by an appropriate combination of known isolation and purification methods.

The thus obtained polypeptide of the present invention can be converted to a salt by a known method or a method according to the known method. Conversely, when the polypeptide is obtained as a salt, it can be converted to an educt or another salt by a known method or a method according to the known method. Further, before or after purification, the protein produced by the recombinant can be freely modified, or polypeptides can be partially removed from it, by allowing an appropriate protein modification enzyme, such as trypsin or chymotrypsin, to act on the protein.

The presence of the polypeptide of the present invention or its salt can be measured by various binding assays and enzyme immunoassay using a specific antibody.

The C terminus of the polypeptide of the present invention is normally a carboxyl group (—COOH) or a carboxylate (—COO⁻), and the C terminus may also be an amide (—CONH₂) or an ester (—COOR). Here, examples of R in the ester include a C1-6 alkyl group, such as methyl, ethyl, n-propyl, isopropyl or n-butyl; a C3-8 cycloalkyl group, such as cyclopentyl or cyclohexyl; a C6-12 aryl group, such as phenyl or α-naphthyl; a phenyl-C1-2 alkyl group, such as benzyl or phenethyl; and a C7-14 aralkyl group, such as an α-naphthyl-C1-2 alkyl group, e.g., α-naphthyl methyl. Further, a pivaloyloxymethyl ester, which is generally used for oral administration, may also be used.

When the polypeptide of the present invention has a carboxyl group (or carboxylate) at a position other than the C terminus, the polypeptide of the present invention also encompasses a polypeptide in which that carboxyl group is amidated or esterified. An example of an ester that is used in this case is the above-mentioned ester at the C terminus. Moreover, the polypeptide of the present invention also encompasses a polypeptide wherein an amino group of a methionine residue at the N-terminus is protected with a protecting group (for example, a C1-6 acyl group, such as a formyl group or an acetyl group); a polypeptide wherein a glutamic acid residue at the N-terminus which is generated by in vivo cleavage is pyroglutamated; a polypeptide wherein OH, COOH, NH₂, SH and the like on the side chain of intramolecular amino acids are protected with appropriate protecting groups (for example, a C1-6 acyl group, such as a formyl group and an acetyl group); or a complex protein, such as a so-called glycoprotein formed by the binding of sugar chains.

A partial polypeptide of the polypeptide of the present invention may be any partial peptide of the above-mentioned polypeptide of the present invention that has activity of substantially the same characteristics. For example, a peptide that is used herein comprises a sequence of at least 10, preferably 50 or more, more preferably 70 or more, still more preferably 100 or more, and most preferably 200 or more amino acids of the amino acid sequence composing the polypeptide of the present invention, and has, for example, biological activity of substantially the same characteristics as the function of the polypeptide of the present invention. A preferable partial polypeptide of the present invention contains, for example, each functional domain. Further, the partial peptide of the present invention normally has a carboxyl group (—COOH) or a carboxylate (—COO⁻) at the C terminus, and, like the polypeptide of the present invention described above, it may instead have an amide (—CONH₂) or an ester (—COOR) at the C terminus. Further, examples of the partial peptide of the present invention, similar to the polypeptide of the present invention described above, include a peptide wherein an amino group of a methionine residue at the N terminus is protected with a protecting group; a peptide wherein a glutamyl residue at the N-terminus which is generated by in vivo cleavage is pyroglutamated; a peptide wherein a substituent on the side chain of intramolecular amino acids is protected with an appropriate protecting group; and a complex peptide, such as a so-called glycopeptide formed by the binding of sugar chains. The partial peptide of the present invention can be used as, for example, a reagent, a reference material for experiments, or an immunogen or a portion thereof.

Particularly preferred salts of the polypeptide of the present invention or the partial peptide are physiologically acceptable acid-added salts. Examples of such salts that are used herein include a salt formed with inorganic acid (for example, hydrochloric acid, phosphoric acid, hydrobromic acid and sulfuric acid), and a salt formed with organic acid (for example, acetic acid, formic acid, propionic acid, fumaric acid, maleic acid, succinic acid, tartaric acid, citric acid, malic acid, oxalic acid, benzoic acid, methane sulfonic acid and benzenesulfonic acid).

The polypeptide of the present invention, the partial peptide thereof or salts thereof, or amides thereof can be prepared by a chemical synthesis method known in the technical field.

For example, amino acids whose α-amino groups and side chain functional groups are appropriately protected are condensed on a resin (a commercially available resin for protein synthesis) in accordance with the sequence of a target polypeptide, according to various condensation methods known in the art. The various protecting groups are then removed simultaneously with cleavage of the polypeptide from the resin at the end of the reaction. Further, a reaction for forming an intramolecular disulfide linkage is conducted in a highly diluted solution, thereby obtaining the target polypeptide, the partial peptide thereof or amides thereof. Examples of activation reagents that can be used to condense the above protected amino acids include those that can be used for polypeptide synthesis and are represented by carbodiimides, such as DCC, N,N′-diisopropylcarbodiimide and N-ethyl-N′-(3-dimethylaminopropyl)carbodiimide. For activation by such reagents, the protected amino acids can be added directly to the resin together with a racemization-suppressing additive (for example, HOBt or HOOBt); alternatively, the protected amino acids can be activated in advance as a symmetric acid anhydride, an HOBt ester or an HOOBt ester, and then added to the resin.

Solvents used for activation of protected amino acids and condensation with resin can be appropriately selected from solvents known in the art as applicable to polypeptide condensation reaction, such as acid amides, halogenated hydrocarbons, alcohols, sulfoxides and ethers. A reaction temperature is appropriately selected from a known range that can be used for reaction of polypeptide linkage formation. Activated amino acid derivatives are normally used in 1.5 to 4-fold excess. When condensation is insufficient as a result of a test using ninhydrin reaction, sufficient condensation can be performed by repeating condensation reaction without eliminating protecting groups. When condensation is still insufficient even when reaction is repeated, unreacted amino acids are acetylated using acetic anhydride or acetylimidazole so as not to affect the subsequent reaction.
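The reagent excess and the handling of incomplete condensation described above amount to a small calculation and a decision rule. The sketch below is illustrative only: the function names are hypothetical, and the limit on the number of repeated condensations (`max_repeats`) is an assumption, since the text does not state how many repeats are attempted before acetylation.

```python
def activated_amino_acid_mmol(resin_sites_mmol: float,
                              fold_excess: float) -> float:
    """Amount of activated amino acid derivative for one coupling,
    given the reactive sites on the resin and a 1.5- to 4-fold excess."""
    if resin_sites_mmol <= 0:
        raise ValueError("resin_sites_mmol must be positive")
    if not 1.5 <= fold_excess <= 4.0:
        raise ValueError("fold_excess is normally 1.5 to 4")
    return resin_sites_mmol * fold_excess


def next_step(condensation_insufficient: bool, repeats_done: int,
              max_repeats: int = 1) -> str:
    """Decide what to do after the ninhydrin test.
    max_repeats is an assumed limit, not a value from the text."""
    if not condensation_insufficient:
        return "proceed to next residue"
    if repeats_done < max_repeats:
        # Repeat the condensation without eliminating protecting groups.
        return "repeat condensation"
    # Still insufficient after repeating: block the unreacted positions.
    return "acetylate with acetic anhydride or acetylimidazole"


needed = activated_amino_acid_mmol(0.25, 4.0)
```

With 0.25 mmol of reactive sites and a 4-fold excess, 1.0 mmol of the activated derivative would be used for the coupling.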

Protecting groups which are normally employed in the technical field can be used for raw materials, such as those for each of amino groups, carboxyl groups and serine hydroxyl groups.

The protection of functional groups that should not take part in the reaction of the raw materials, the protecting groups themselves, the elimination of the protecting groups, the activation of functional groups that take part in the reaction, and the like can be appropriately selected from known groups or performed by known means.

The partial peptide of the present invention or a salt thereof can be produced according to a peptide synthesis method known in the technical field, or by cleaving the polypeptide of the present invention with appropriate peptidase. For example, the peptide synthesis method may be either a solid-phase synthesis method or a liquid phase synthesis method. Examples of a known condensation method and a method of elimination of protecting groups are described in Nobuo IZUMIYA et al., Basics and Experiment for Peptide Synthesis, Maruzen (1975); Haruaki YAJIMA and Shunpei SAKAKIBARA, Experiment Course for Biochemistry 1, Protein Chemistry IV, 205 (1977); and Development of Pharmaceutical Preparation 2, vol. 14, Peptide Synthesis, under the editorship of Haruaki YAJIMA, Hirokawa Publishing Co.

After the reaction, the partial peptide of the present invention can be purified and isolated using a combination of known methods, such as solvent extraction, distillation, column chromatography, liquid chromatography and recrystallization. When the partial peptide obtained by the above methods is in a free form, it can be converted to an appropriate salt by a known method. Conversely, when the peptide is obtained as a salt, it can be converted to a free form by a known method.

The antibody for the polypeptide of the present invention, the partial peptide thereof or salts thereof may be either a polyclonal or a monoclonal antibody, so far as it can recognize these substances. The antibody for the polypeptide of the present invention, the partial peptide thereof or salts thereof can be produced using as an antigen the polypeptide of the present invention or the partial peptide thereof according to a known method for producing antibodies or anti-serum.

The antibody of the present invention can be used to detect the polypeptide of the present invention and the like which are present in a specimen, such as body fluid, tissues or the like. In addition, the antibody can be used for preparing an antibody column to be used for purifying these substances; detecting the polypeptide of the present invention in each fraction upon purification; analyzing the behavior of the polypeptide of the present invention within the cells of a specimen; and the like.

The use of the DNA, the polypeptide and the antibody of the present invention will be further described below.

Using as a probe the DNA of the present invention, the antisense DNA of the DNA of the present invention, or a gene construct containing these DNAs, abnormalities (of the gene) in DNA or mRNA encoding the polypeptide of the present invention or the partial peptide thereof can be detected.

The DNA, the antisense DNA or the gene construct of the present invention is useful as a genetic diagnostic agent for, for example, damage, mutation or hypoexpression in the DNA or mRNA, and an increase or hyperexpression of the DNA or mRNA. The above gene diagnosis using the DNA of the present invention can be performed by, for example, a known northern hybridization or a PCR-SSCP method (Genomics, 5:874-879 (1989), Proc. Natl. Acad. Sci. USA, 86:2766-2770 (1989)).

Moreover, for patients who cannot exert normal in vivo functions because of abnormalities or deletions in the DNA or the gene of the present invention, or because the expression amount of the DNA or the gene of the present invention is reduced, it is effective to introduce the DNA or the gene construct of the present invention into the bodies of the patients for expression by gene therapy, using as vehicles appropriate vectors, such as retrovirus vectors, adenovirus vectors and adeno-associated virus vectors, according to known techniques. Further, when patients cannot exert normal functions because of an increased expression amount, introduction of antisense DNA can be effective.

The DNA, the antisense DNA of the present invention, or the gene construct thereof can be administered alone, or in combination with an adjuvant to promote uptake using a gene gun or a catheter, such as a hydrogel catheter.

In another example, injection of the polypeptide of the present invention or the like into patients with the above diseases also enables the polypeptide of the present invention or the like to exert its function in the patients.

Furthermore, the antibody of the present invention can be used for quantitatively determining the polypeptide of the present invention in a test liquid by a known method. Specifically, the antibody of the present invention can be used for quantitative determination by a sandwich immuno-assay using monoclonal antibodies, detection by tissue staining, and the like, by which, for example, diseases that involve the polypeptide of the present invention or the like can be diagnosed.

For these purposes, an antibody molecule itself can be used, or the F(ab′)2, Fab′ or Fab fraction of the antibody molecule can be used. Quantitative determination methods for the polypeptide of the present invention using the antibody of the present invention are not specifically limited. Any measurement method can be used, so far as it involves detecting the amount of antibodies, antigens or antibody-antigen complexes corresponding to the amount of antigens (for example, protein amount) in a test liquid by chemical or physical means, and calculating the amount with a calibration curve which has been prepared using a standard solution containing a known amount of antigens. For example, nephelometry, competitive assay, immunometric assay and sandwich assay are preferably used, and the sandwich assay described later is preferred in terms of sensitivity and specificity. Examples of a labeling agent that can be used in a measurement method using a labeling substance include substances known in the technical field, such as radioisotopes, enzymes, fluorescent materials and light-emitting materials.
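The calibration-curve calculation mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the measured signal is linear in the antigen amount over the range of the standards; real immunoassay curves are often non-linear and would need a different model. The function names and the numbers in the usage lines are hypothetical and are not data from the present specification.

```python
def fit_line(amounts, signals):
    """Least-squares line through (known antigen amount, measured signal) pairs."""
    n = len(amounts)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two standards with matching signals")
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    if sxx == 0:
        raise ValueError("standards must cover more than one antigen amount")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def antigen_amount(signal, slope, intercept):
    """Read an unknown sample off the calibration line."""
    if slope == 0:
        raise ValueError("a flat calibration line cannot be inverted")
    return (signal - intercept) / slope


# Illustrative standards only (arbitrary units).
slope, intercept = fit_line([0.0, 1.0, 2.0, 4.0], [1.0, 3.0, 5.0, 9.0])
unknown = antigen_amount(7.0, slope, intercept)
```

The same two steps (fit on standards, invert for the test liquid) apply whichever labeled species is detected; only the curve model changes.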

Details about the general technical procedures concerning these measurement and detection methods can be found in reviews, reference books and the like, such as Radioimmunoassay 2 edited by Hiroshi IRIE, (Kodansha, issued in 1979); Enzyme Immunoassay edited by Eiji ISHIKAWA et al., (3rd edition; Igaku-Shoin, issued in 1987); and Methods in Enzymology (issued by Academic Press), vol. 70, “Immunochemical Techniques (Part A),” vol. 73, “Immunochemical Techniques (Part B),” vol. 74, “Immunochemical Techniques (Part C),” vol. 84, “Immunochemical Techniques (Part D: Selected Immunoassays),” vol. 92, “Immunochemical Techniques (Part E: Monoclonal Antibodies and General Immunoassay Methods),” and vol. 121, “Immunochemical Techniques (Part I: Hybridoma Technology and Monoclonal Antibodies).”

Moreover, DNA chips prepared by arraying the DNA of the present invention are useful in detecting mutations and polymorphism of the DNA of the present invention, and monitoring the DNA dynamics. Regarding DNA array, which is a type of DNA chip, see “DNA Microarray and Current PCR method” (a separate volume of Cell Technology, Genome Science Series 1, under the editorship of Masaaki MURAMATSU and Hiroyuki NABA, 1st impression of the first edition, issued on Mar. 16, 2000) and the like.

Further, a polypeptide chip prepared by arraying the polypeptides of the present invention can be a powerful tool for functional analysis of the expression, interaction and posttranslational modification of the polypeptides of the present invention, and for identification and purification of proteins.

An antibody chip prepared by arraying antibodies against the polypeptides of the present invention is very useful in analyzing the correlation between the polypeptides of the present invention and diseases, disorders, or other physiological phenomena.

Methods and materials for preparing the chip are known by persons skilled in the art.

Furthermore, the polypeptides of the present invention or the like are useful as reagents for screening compounds which interact specifically with these substances. Specifically, the present invention provides a method for screening compounds which specifically interact with the polypeptide of the present invention, a partial peptide thereof or salts thereof, or antibodies against them by using these substances; and provides the screening kit therefor.

Compounds or salts thereof that are identified using the screening method or the screening kit of the present invention are selected from the above test compounds. The compounds interact with the polypeptide of the present invention or the like. For example, the compounds regulate, inhibit, promote or antagonize the biological activity of the polypeptide of the present invention or the like. The compound or a salt thereof may directly act on the activity of the polypeptide of the present invention or the like, or indirectly act on the activity of the polypeptide of the present invention or the like by acting on the expression of the polypeptide of the present invention or the like. An example of the salt of the compound that is used herein is a pharmaceutically acceptable salt. Specific examples of such salts include a salt formed with inorganic base, a salt formed with organic base, a salt formed with inorganic acid, a salt formed with organic acid, and a salt formed with basic or acidic amino acid. Compounds which inhibit the biological activity of the polypeptide of the present invention or the like can also be used as pharmaceutical preparations, such as therapeutic agents and preventive agents for each of the above-mentioned diseases.

When nucleotides (bases) and amino acids are indicated with abbreviations in the present specification, the abbreviations follow the IUPAC-IUB Joint Commission on Biochemical Nomenclature, or those commonly used in the art. Amino acids for which optical isomerism is possible are, unless otherwise specified, in the L form.

BEST MODE FOR CARRYING OUT THE INVENTION

The present invention will now be further described by means of examples that are not intended to limit the present invention. The various gene manipulations employed in the examples are according to the methods described in the above Current Protocols in Molecular Biology (edited by Frederick M. Ausubel et al., 1987).

(1) Construction of cDNA Library Derived from Human Adult Whole Brain, Human Adult Hippocampus and Human Embryonic Whole Brain

Double-stranded cDNA was synthesized using an oligonucleotide having Not-I site (GACTAGTTCTAGATCGCGAGCGGCCGCCC(T)₁₅) (Invitrogen) as a primer, mRNAs (Clontech) derived from the human adult whole brain, the human adult hippocampus and the human embryonic whole brain as templates, and a SuperScriptII reverse transcriptase kit (Invitrogen). Next, an adaptor (Invitrogen) having SalI site was ligated to the cDNA, followed by digestion with NotI and 1% low-melt agarose electrophoresis. Thus, DNA fragments of 3 kb or more were purified.
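As a quick check of the primer design described above, the following sketch assembles the full oligonucleotide from the sequence given in the text and locates the NotI recognition sequence (GCGGCCGC) within it. The constant and function names are introduced here for illustration only.

```python
# 5' portion of the primer as printed in the text, followed by (T)15.
PRIMER_5PRIME = "GACTAGTTCTAGATCGCGAGCGGCCGCCC"
OLIGO_DT_LENGTH = 15

# Recognition sequence of the restriction enzyme NotI.
NOTI_SITE = "GCGGCCGC"


def build_primer() -> str:
    """Return the full primer: adaptor portion plus the oligo(dT) tail."""
    return PRIMER_5PRIME + "T" * OLIGO_DT_LENGTH


def site_position(sequence: str, site: str) -> int:
    """0-based index of the first occurrence of site, or -1 if absent."""
    return sequence.upper().find(site.upper())


primer = build_primer()
noti_at = site_position(primer, NOTI_SITE)
```

Because the site lies upstream of the oligo(dT) tail, digestion with NotI after adaptor ligation leaves a NotI end at the 3′ side of the cDNA and a SalI end at the 5′ side, which matches the directional cloning into the SalI-NotI treated plasmid described in the text.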

The purified cDNA fragment was ligated to pBluescript II SK+ plasmid pre-treated with SalI-NotI restriction enzymes. The recombinant plasmid was introduced into Escherichia coli strain ElectroMax DH10B (Invitrogen) by electroporation.

(2) Screening

Subsequently, clones were randomly picked from the constructed cDNA library, and then spotted onto membranes. A mixture of oligo DNAs (each comprising 21 nucleotides) was prepared based on each of the full-length nucleotide sequences of approximately 1300 clones that we had previously analyzed. The 3′ terminus of each oligo DNA was labeled with DIG using terminal transferase. Using the DIG-labeled DNAs as probes, dot hybridization (Current Protocols in Molecular Biology, edited by Frederick M. Ausubel et al., 1987) was performed so as to remove redundant clones (clones containing the same sequences).
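The redundancy screen above is a wet-lab hybridization, but its logic can be mirrored in silico once clone sequences are available. The sketch below is such an analogue under stated assumptions: it takes one 21-nucleotide probe from each previously analyzed sequence (the choice of the first 21 nucleotides is arbitrary and hypothetical, since the text does not say how each probe was chosen) and flags a clone as redundant when either strand contains a probe exactly. Real hybridization tolerates mismatches, which this exact-match check does not model.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def make_probes(known_sequences, length=21, offset=0):
    """One probe of the given length per previously analyzed sequence."""
    probes = set()
    for seq in known_sequences:
        seq = seq.upper()
        if len(seq) >= offset + length:
            probes.add(seq[offset:offset + length])
    return probes


def is_redundant(clone_seq: str, probes) -> bool:
    """True if any probe matches the clone on either strand."""
    forward = clone_seq.upper()
    reverse = reverse_complement(forward)
    return any(p in forward or p in reverse for p in probes)
```

Clones for which `is_redundant` returns `False` correspond to the spots that give no hybridization signal and are carried forward.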

Next, in vitro transcription and translation (Promega, TNT T7 Quick Coupled Transcription/Translation System cat. No. L1107) were performed, thereby selecting clones for which products of 50 kDa or more had been confirmed.
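The selection criterion in this step amounts to a simple threshold filter. A minimal sketch, assuming the apparent product size of each clone has been recorded in kDa (the clone names and sizes used with it would be hypothetical, and `None` stands for a clone with no detectable product):

```python
MIN_PRODUCT_KDA = 50.0  # threshold stated in the text


def select_clones(product_sizes_kda, threshold=MIN_PRODUCT_KDA):
    """Names of clones whose translation product is at least the threshold size."""
    return sorted(
        name
        for name, size in product_sizes_kda.items()
        if size is not None and size >= threshold
    )
```

A clone whose product is exactly 50 kDa is kept, consistent with the wording "50 kDa or more".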

The terminal nucleotide sequences of the selected clones were then determined. Using the obtained sequences as queries, a homology search program BLASTN 2.0.14 (Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Res. 25:3389-3402) was run on nr database (GenBank+EMBL+DDBJ+PDB sequences which do not contain EST, STS, GSS or HTGS (phase 0, 1 or 2) sequences). As a result, the full-length nucleotide sequences of novel genes, for which no homologous gene was present, were determined.

For sequencing, a DNA sequencer (ABI PRISM377) and a reaction kit which are manufactured by PE Applied Biosystems were used. Most sequences were determined by a dye terminator method using shotgun clones. Part of the nucleotide sequences was determined by synthesizing oligonucleotides based on the determined nucleotide sequences, and then performing a primer walking method.

As described above, screening for novel DNAs or genes was performed. As a result, a novel DNA or a gene represented by any one of SEQ ID NOS: 71 to 140 in the sequence listing was detected.

The nucleotide sequences of these novel DNAs or genes were determined by the above sequencing method. Table 1 shows the names of clones having the DNA or the gene of the present invention, the length of a polypeptide encoded by the gene in the clone, its putative function and the grounds for prediction.

TABLE 1

Clone Name and Putative Function

SEQ ID NO:	Clone name	Protein length	Full length or partial sequence	Putative function	Grounds for prediction

1	fg01864	323	Partial	Involved in mitoses, cell motility	Partially has a region
			sequence	and phagocytosis through the	having 50% homology to
				regulation of the cytoskeleton.	coronin-, actin-binding
				Useful in therapy and diagnosis in	protein 1C.
				the field of regulating the dynamic
				states of cells, such as suppression
				of cancer metastasis and the action
				of immunocytes.
2	fg02011	314	Partial	Regulates gene expression by binding	Partially has a region
			sequence	of C2H2 type zinc finger motif to	having 42% homology to
				DNA, and interaction between	zinc finger protein 91
				Kruppel-associated box (KRAB)	and has zf-C2H2 motifs.
				domain and the other transcriptional
				apparatus. The deletion or the
				mutation of the protein may cause
				abnormalities in morphogenesis or
				cell proliferation. The detection
				of the mutation is useful in
				diagnosing cancer and the
				introduction of the normal gene is
				useful in treating cancer.
3	fg02301	187	Partial	A molecule inferred to be involved in	Partially has a region
			sequence	cell adhesion because it has a	having 39% homology to
				transmembrane domain and three	the immunoglobulin
				Ig-like C2-type domains, and shares	superfamily, and has ig
				high homology with NCAM1 and NCAM2.	motifs and a sosui
				Since the molecule regulates	transmembrane domain.
				intercellular adhesion, it is useful
				in diagnosing and treating
				canceration of cells and cancer
				metastasis, and screening for the
				therapeutic agent.
4	fg02936	1479	Partial	A membrane protein expressed in the	Partially has a region
			sequence	nervous system. Inferred to function	having 99% homology to
				as receptors for semaphorins, the	NOV/plexin-A1 protein
				guidance factor to elongate neural	and has the function
				axial filaments, thus regulating	motifs of each of Sema,
				neuron formation. With its	Plexin_repeat, integrin_B,
				function to regulate the growth of	and TIG.
				neural axial filaments, it is useful
				in diagnosing and treating a variety
				of neuropsychiatric diseases, or
				screening for the therapeutic agent.
5	fg04068	258	Partial	Encodes a guanine nucleotide	Partially has a region
			sequence	exchange factor (GEF) whose target	having 91% homology to a
				is Rho-type GTPase. Activates Rho	neuronal guanine
				signal by converting Rho to GTP type.	nucleotide exchange
				The expression of the protein is high	factor, and has a PH
				in the brain, suggesting that it is	domain motif.
				involved in brain functions.
				Useful in diagnosing and treating
				cancer, and screening for the
				therapeutic agent. Also inferred
				to be useful in improving brain
				functions, since it is strongly
				expressed in the brain and is thought
				to be involved in recognition
				functions.
6	fg05423	675	Partial	A DNA-binding protein having a zinc	Partially has a region
			sequence	finger motif, and is a	having 51% homology to
				transcriptional regulatory factor.	EVI1 protein, and has
				Inferred to be a protooncogene,	zf-C2H2 motifs and a
				similar to EVI-1, or a causative	zf-BED motif.
				protein of osteomyelodysplasia
				syndrome. Thus, it is thought to be
				useful in diagnosing and treating
				cancer or osteomyelodysplasia, and
				in screening for the therapeutic
				agent.
7	fg06344	248	Partial	Inferred to synthesize acetyl	Partially has a region
			sequence	coenzyme A. The protein may be	having 60% homology to
				useful in screening for anticancer	acetyl CoA synthetase.
				agents and immunosuppressant
				agents.
8	fg06691	193	Partial	Inferred to have high homology with	Partially has a region
			sequence	an enzyme, proline dehydrogenase,	having 93% homology to
				and has functions similar to proline	proline dehydrogenase.
				dehydrogenase. Oxidizes proline to
				Δ1-pyrroline-5-carboxylic acid. The
				deletion of the enzyme causes
				hyperprolinemia. Proline
				regulates transmission by
				glutamate-operated synapses, and
				controls neurotransmission in the
				brain. Elevated blood proline
				levels lead to abnormal sensorimotor
				function. Thus, it is useful in
				diagnosing and treating mental
				disorders due to hyperprolinemia and
				proline metabolic disorders.
9	fh02216	373	Partial	Has α1,2-mannosidase activity to	Partially has a region
			sequence	remove the terminal mannose of	having 100% homology to
				Man9GlcNAc2, the precursor formed	α1,2-mannosidase, and
				in ER during the biosynthetic	has a Glyco_hydro_47
				pathways of N-glycoside-binding	motif and a sosui
				sugar chain; and plays an important	transmembrane domain.
				role in sugar chain synthesis of
				N-glycoside-binding glycoprotein.
				The N-glycoside-binding sugar chain
				functions everywhere in vivo. The
				deletion of the protein may cause
				diseases induced by deficient
				N-glycoside binding sugar chain.
				The protein is useful in treating and
				diagnosing these diseases.
10	fh02982	215	Partial	Regulates exocytosis, triggered by	Partially has a region
			sequence	Ca2+, of neurotransmitters in	having 68% homology to
				synapse. Inferred to be useful in	NIM3, and has a C2 domain
				diagnosing and treating nervous	motif.
				diseases.
11	fh03203	1134	Partial	An extracellular matrix	Partially has a region
			sequence	glycoprotein which responds to	having 37% homology to
				pheromone and is transcribed.	hydroxyproline-rich
				Involved in biophylaxis.	glycoprotein DZ-HRGP.
12	fh05673	438	Partial	Expressed upon cephalization to be a	Partially has a region
			sequence	guidance for the growth of neural	having 57% homology to
				axial filaments. Not a type which	netrin-G1c and has a
				acts by diffusion, but acts locally	laminin_Nterm motif and a
				on the surface of a cell membrane.	laminin_EGF motif.
				Useful in diagnosing and treating
				various neuropsychiatric diseases,
				or screening for the therapeutic
				agents for these diseases, since the
				protein regulates the growth of
				neural axial filaments.
13	fh06634	505	Partial	One of the proteins forming an	Partially has a region
			sequence	adaptor molecule complex which	having 100% homology to
				transduces a signal from tyrosine	guanine
				kinase to Ras. Functions as guanine	nucleotide-releasing
				nucleotide-releasing factor 2 for	factor 2, and has a
				Ras. Since the protein is involved	RasGEFN motif and a
				in signal transduction from a	RasGEF motif.
				receptor to Ras, it is useful in
				diagnosing and treating cancer
				through the regulation of cell
				proliferation, and screening for the
				therapeutic agent.
14	fh08795	572	Partial	Promotes GTPase activity of	Partially has a region
			sequence	Ras-related nuclear protein Ran,	having 100% homology to
				which is involved in cell cycle; thus	Ran GTPase activating
				promotes conversion of active	protein 1.
				GTP-Ran to inactive GDP-Ran, thereby
				regulating initiation of cell
				mitosis. Useful in diagnosing and
				treating cancer, and screening for
				the therapeutic agent, since
				abnormalities in the protein can
				cause canceration.
15	fh13310	1051	Partial	Inferred to help the migration of	Partially has a region
			sequence	hnRNPA1 from cytoplasms to nuclei by	having 98% homology to
				binding to hnRNPA1, which is a	karyopherinβ2b,
				protein controlling mRNA processing	transportin, and has
				and the transport of mRNA from nuclei	Armadillo_seg motifs and
				to cytoplasms. Useful in	HEAT motifs.
				diagnosing and treating cancer by
				regulating the gene.
16	fh18356	887	Partial	Inferred to be a factor which is	Partially has a region
			sequence	induced in blastocysts by	having 94% homology to
				parathyroid hormone, and involved in	PTH-responsive
				the activation of blastocysts by	osteosarcoma B1 protein.
				parathyroid hormone and
				osteogenesis. Useful in diagnosing
				and treating bone diseases, such as
				osteoporosis and a variety of
				cancers, and screening for the
				therapeutic agent for these
				diseases.
17	fh18358	689	Partial	Promotes the formation of	Partially has a region
			sequence	CDC25c/14-3-3 protein complex by	having 64% homology to
				phosphorylating Ser216 of	Cdc25C associated
				CDC25c; and regulates the initiation	protein kinase C-TAK1,
				of cell mitosis through the complex.	and has a pkinase motif
				Useful in diagnosing and treating	and a UBA motif.
				cancer and screening for the
				therapeutic agent, since it is
				involved in the regulation of cell
				division.
18	fh20539	1004	Partial	Has the ankyrin repeat and is	Partially has a region
			sequence	involved in protein interaction.	having 35% homology to
				Useful in treating cystic fibrosis,	FRANK2 protein, and has
				since it is involved in regulating	ank repeat motifs.
				CFTR expression.
19	fh22167	761	Partial	Serine/threonine kinase in the	Partially has a region
			sequence	intracellular signal transduction	having 30% homology to
				system. Useful in screening a drug	protein kinase WNK1.
				for diseases which involve the
				signal transduction system.
20	fh23421	480	Partial	A nuclear protein involved in mRNA	Has 97% homology to a
			sequence	splicing. Concentrated in portions	putative splicing
				referred to as nuclear YT bodies, and	factor, YT521-B.
				inferred to provide a site for mRNA
				splicing. Useful in diagnosing and
				treating cancer, and screening for
				the therapeutic agent, since it is
				involved in regulating expression
				and cell proliferation. Also
				useful in the field of regenerative
				medicine.
21	fh24594	762	Partial	Involved in binding synaptosome	Partially has a region
			sequence	binding protein (SNAP-25) to the	having 94% homology to
				cytoskeleton, and regulating	SNAP-25-interacting
				exocytosis. Useful in diagnosing,	protein, and has a
				preventing and treating nervous	Troponin motif.
				diseases, since it is involved in
				regulating the release of
				neurotransmitters.
22	fh26207	1094	Partial	A GnRH-like decapeptide precursor	Partially has a region
			sequence	acting as gonadotropin releasing	having 98% homology to
				hormone. Useful in diagnosing,	putative preoptic
				preventing and treating	regulatory factor-2
				abnormalities in sex hormones, such	precursor.
				as infertility and cancer.
23	fj00154	388	Partial	An enzyme which substitutes	Partially has a region
			sequence	adenosine residue 37 of alanine tRNA	having 99% homology to
				with an inosine residue. Useful in	adenosine deaminase
				preventing, treating and diagnosing	acting on tRNA 1, and has
				diseases involving the modification	A_deamin motifs.
				of RNA such as tRNA.
24	fj00597s1	523	Full	Inferred to transport iron or other	Partially has a region
			length	divalent cations or to function as a	having 39% homology to
				membrane-binding receptor. For	TTYH1, and has sosui
				example, iron metabolic disorders	transmembrane motifs.
				cause blood diseases, such as
				anemia, neurodegenerative
				diseases and the like. Thus it
				is useful in diagnosing and treating
				such diseases by detecting and
				regulating the expression and the
				function of the protein.
25	fj03879s1	1653	Partial	Acts on protein interaction since it	Partially has a region
			sequence	has a PH domain. Inferred to act on	having 44% homology to
				the morphological changes in	P116 RHO-interacting
				neurons. Also inferred to inhibit	protein (P116RIP)
				cell expansion and the elongation of	(RIP3), and has a PH
				neurons by acting on Rho. Useful in	domain motif.
				diagnosing and treating nervous
				diseases and cancer, and screening
				for the therapeutic agents for these
				diseases by regulating the gene.
26	fj04226	959	Partial	A microtubule-associated protein	Partially has a region
			sequence	which regulates microtubule	having 61% homology to
				kinetics and interaction between	microtubule-associated
				microtubules and other	protein 4, and has
				intracellular molecules. With the	tubulin-binding motifs.
				strong involvement of a
				microtubule-associated protein in
				cancer and Alzheimer's disease, the
				protein, a putative member of the
				protein family, is useful in
				diagnosing and treating these
				diseases.
27	fj04751	878	Full	A protein which binds to oxysterol,	Partially has a region
			length	and plays an important role in	having 60% homology to
				regulating cholesterol metabolism.	oxysterol-binding
				Useful in diagnosing and treating	protein, and has a PH
				cardiovascular diseases caused by	domain and an
				abnormalities in cholesterol	Oxysterol_Bp motif.
				metabolism, and screening for the
				drug.
28	fj05456	281	Partial	Inferred to act as a cytoskeleton	Partially has a region
			sequence	factor in neurons of the brain.	having 32% homology to a
				This protein has 5 Kelch motifs,	ring canal protein, and
				while Gigaxonin (mutated Gigaxonin	has Kelch motifs.
				is found in giant axonal neuropathy)
				has the BTB domain and 6 Kelch
				motifs. The protein is useful in
				diagnosing and treating giant axonal
				neuropathy and degenerative
				disorders in the nervous system
				(e.g., amyotrophic lateral
				sclerosis, amyotrophy,
				Charcot-Marie-Tooth disease).
29	fj06918	707	Partial	Present in neurons, and co-exists	Partially has a region
			sequence	with ion channels. Involved in	having 46% homology to a
				differentiation of the functional	cell recognition
				domain of axial filaments. Useful	molecule, Caspr2, and has
				in treating, preventing, and	laminin_G motifs and an
				diagnosing nervous diseases, and	EGF motif.
				screening for the therapeutic agent.
30	fj08985	341	Partial	A protein having a motif which binds	Partially has a region
			sequence	to GTP-Rho, and which plays a role in	having 80% homology to
				transducing Rho signal to other	GTP-rho binding protein
				proteins. Involved in the regulation	1, and has a BRO1 motif.
				of the cytoskeleton which is based on
				actin, the contraction of the smooth
				muscle, transcription, cell
				proliferation, and the regulation of
				cell cycle. Useful in diagnosing
				and treating diseases caused by
				abnormalities in the cytoskeleton
				and morphogenesis, and cancer, and
				screening for the therapeutic agent.
31	fj10564	531	Partial	An isozyme of the kinase which	Partially has a region
			sequence	converts inositol triphosphate to	having 100% homology to
				inositol tetraphosphate.	inositol 1,4,5-
				Regulates intracellular calcium	triphosphate 3-kinase
				levels and is involved in signal	(IP3K).
				transduction. Expressed in the
				hippocampus. Useful in screening
				for an agent selectively acting on
				the inositol phosphate pathway.
32	fj11471	1199	Partial	Has domains involved in histogenesis	Partially has a region
			sequence	and development of extremities.	having 37% homology to
				Possible involvement in cell	FH1/FH2
				localization, cell division and the	domain-containing
				regulation of the cytoskeleton.	protein FHOS, and has a
				High expression of this protein in	FH2 motif.
				the spleen suggests its involvement
				in maturation and development of B
				cells and erythrocytes. Useful in
				diagnosing and treating diseases
				caused by cell or tissue
				development, morphogenesis and
				maturation, and in regenerative
				medicine.
33	fj12188	449	Partial	Regulates the binding and fusion of	Partially has a region
			sequence	synaptic vesicles at the synaptic	having 51% homology to
				termini in the brain. Useful in	serine/threonine-
				treating and diagnosing diseases	protein kinase DCAMKL1,
				with abnormalities in neural	and has a pkinase motif.
				transmission.
34	fj14406	1354	Partial	A motor molecule which converts ATP	Partially has a region
			sequence	(chemical energy) into physical	having 37% homology to
				force so as to move along	1β dynein heavy chain,
				microtubules. While dyneins	and has a Dynein_heavy
				involved in mitosis, vesicle	motif.
				transport, and the movement of cilia
				and flagella exist as multisubunit
				complexes, the protein functions as
				1β dynein heavy chain which is a
				component of the complex. Useful in
				diagnosing and treating cancer, and
				screening for the therapeutic agent.
				Also useful in diagnosing cranial
				nerve diseases, such as
				hydrencephaly, infertility and
				respiratory apparatus-related
				diseases.
35	fj15278	966	Partial	Involved in regulation of binding	Has 95% homology to
			sequence	and fusion of synaptic vesicles to	rsec8.
				pre-synaptic membranes. A
				component of a complex involved in
				neural transmission. Useful in
				treating and diagnosing diseases
				with abnormalities in neural
				transmission.
36	fj16085s1	1766	Partial	Regulates cell differentiation and	Partially has a region
			sequence	cell proliferation by interacting	having 57% homology to
				with proteins having the SET domain.	nuclear dual-specificity
				Useful in diagnosing and treating	phosphatase, and has a
				cancer, and screening for the	DENN, a GRAM and a PH
				therapeutic agent.	domain motif.
37	fj17028	498	Partial	Produces phospholipids, the second	Partially has a region
			sequence	messenger, and involved in	having 95% homology to
				intracellular reactions including	phosphatidic
				production of superoxides in	acid-preferring
				neutrophils, actin polymerization	phospholipase A1, and has
				and the like. Useful in diagnosing	a DDHD motif.
				and treating infectious diseases,
				inflammation and immune diseases,
				and screening for the drug.
38	fj17066	389	Full	Regulates the expression of homeotic	Partially has a region
			length	genes by modifying the structure of	having 87% homology to
				chromosomes, and inhibits	chromobox homolog 8, and
				(functions to perform silencing)	has a chromo motif.
				gene expression. Useful in
				diagnosing and treating cancer, and
				screening for the therapeutic agent.
				Also useful in gene diagnosis of
				malformation, teratogeny and the
				like, and in the field of
				regeneration medicine.
39	gh01817b	380	Partial	Dissociates a transcribed complex	Partially has a region
			sequence	from a template. Can be used for	having 92% homology to
				analyzing the transcriptional	polymerase I and a
				mechanism.	transcript release
					factor.
40	gh13812	360	Partial	A regulatory subunit of phosphatase	Partially has a region
			sequence	which regulates the activity of a	having 93% homology to
				pyruvate dehydrogenase complex.	pyruvate dehydrogenase
				Useful in diagnosing and treating	phosphatase regulatory
				cancer, and screening for the	subunit precursor, and
				therapeutic agent. Also useful in	has a GCV_T motif.
				screening for an antiobesity drug.
41	hh05136b	832	Partial	A homologue of collagen V precursor.	Partially has a region
			sequence	Collagen V plays an important role in	having 46% homology to
				forming extracellular matrix.	collagen α1 (V) chain
				Useful in diagnosing cirrhosis, and	precursor, and has
				as biological base materials in	Collagen motifs and COLFI
				regeneration medicine.	motifs.
42	hh05356	370	Partial	Forms a spindle during cell	Partially has a region
			sequence	division, and delivers chromosomes	having 97% homology to
				to daughter cells. Involved in	tubulin β-5 chain (β-
				constructing and maintaining the	tubulin class-V).
				three-dimensional structure of a
				cytoplasm together with actin fibers
				and intermediate filaments. Useful
				in diagnosing and treating cancer,
				and screening for the therapeutic
				agent.
43	hh10052	412	Partial	Inferred to be the gene product of a	Partially has a region
			sequence	novel human cartilage link protein	having 49% homology to
				family, which is important in	proteoglycan link
				differentiation and proliferation	protein precursor
				of cartilage cells. Useful in	(cartilage link
				regeneration of the cartilage.	protein), and has an ig
					motif and Xlink motifs.
44	hh13045	803	Partial	Inferred to be novel cadherin	Partially has a region
			sequence	molecules, since they have cadherin	having 30% homology to
				repeats. Involved in cell	FAT tumor suppressor, and
				adhesion. Possible involvement in	has cadherin motifs.
				segregation of cancer cells from
				primary layers and infiltration with
				cancer cells. Useful in diagnosing
				and treating cancer, and screening
				for the therapeutic agent. Also
				useful as a marker for renal
				diseases, because of its high
				expression in kidney.
45	hh14180	1036	Full	(Threonine)-O-binding	Has a region having 99%
			length	N-acetylglucosamine transferase,	homology to
				which controls activities of various	N-acetylglucosaminyl
				proteins including a transcription	transferase 110 KDA
				factor, a nuclear membrane protein,	subunit, and has TPR
				a cytoskeletal protein and a	motifs.
				cancer-related protein within the
				nucleus and cytoplasm. Useful in
				diagnosing and treating various
				diseases, such as cancer, and
				screening for the therapeutic agent.
46	hj02562	277	Partial	A protein which may function as a	Has a region having 88%
			sequence	co-activator in RNA polymerase II	homology to PC2 glutamine
				complexes. Possible involvement in	rich binding protein.
				cranial nerve diseases, such as
				Alzheimer's disease and
				Parkinson's disease, because they
				have glutamine repeats. Useful in
				diagnosing cranial nerve diseases,
				such as Alzheimer's disease and
				Parkinson's disease, and as a
				target therapeutic agent to be
				developed.
47	hj03865	1115	Partial	Has 98% homology to a	Partially has a region
			sequence	huntingtin-associated protein	having 98% homology to
				interacting protein (HAPIP) which	huntingtin-associated
				binds to a protein (Duo) binding to	protein interacting
				huntingtin, the cause of	protein, and has a RhoGEF
				Huntington's chorea. High	and a PH domain motif.
				expression in the brain. UNC-73,
				the C. elegans homologue of HAPIP,
				has been shown to be involved in axonal
				guidance. Suggested to be involved
				in signal transduction, because it
				has the RhoGEF motif. Inferred to
				be useful in diagnosing and treating
				Huntington's chorea. Useful as a
				target gene for developing an agent
				for nerve regeneration because of
				the involvement in axonal guidance.
48	hj05256	783	Partial	Inferred to be a transcription	Partially has a region
			sequence	factor, since the protein has the	having 48% homology to
				zinc finger motif. A deletion or a	zinc finger protein 91,
				mutation in the protein may cause	and has zf-C2H2 motifs.
				abnormalities in morphogenesis and
				cell proliferation. Detection of a
				mutation in the gene is useful in
				diagnosing cancer, and introduction
				of the normal gene is useful in
				treating cancer.
49	hk02174	797	Partial	A protein, which is accumulated in	Partially has a region
			sequence	significant amount in the	having 94% homology to
				post-synaptic density of excitatory	proline rich synapse
				synapses. Inferred to be a gene	associated protein 2, and
				encoding a protein which anchors	has a SAM motif.
				SAP90/PSD-95, the scaffold for a
				membrane receptor, to the
				cytoskeleton in synapses using
				glutamate in the central nervous
				system.
				The protein may have influence on the
				generation of the neural network,
				and establishment of memory and
				learning. Useful in diagnosing
				various neuropsychiatric disorders,
				and screening for the therapeutic
				agent.
50	pf00330s1	1043	Full	A protein, the scaffold for ephrin B,	Partially has a region
			length	which plays an important role by	having 87% homology to
				guiding axial filaments in the	glutamate receptor
				embryogenesis, to form a complex	interacting protein 2,
				that transduces signal. Inferred	and has PDZ motifs.
				to involve neural circuit formation.
				Useful in gene diagnosis and the
				field of regeneration medicine.
51	pf00447	421	Partial	It may bind to a protein having the	Partially has a region
			sequence	SH3 domain, because the protein is	having 41% homology to
				homologous to SH3-domain binding	SH3-domain binding
				protein. Since the protein having	protein 5
				the SH3-domain is often involved in	(BTK-associated).
				intracellular signal transduction,
				it can be inferred that the protein
				has similar functions. Useful in
				diagnosing and treating cancer, and
				screening for the therapeutic agent.
52	pg00239	1644	Partial	Inferred to perform protein	Partially has a region
			sequence	interaction, since the protein has	having 30% homology to
				the ankyrin repeat. Possible	ankyrin 3, and has ank
				involvement in signal transduction	repeat motifs.
				and transcriptional control.
				Useful in diagnosing and treating
				cancer, and screening for the
				therapeutic agent.
53	pg00264	534	Partial	Inferred to be sialyltransferase,	Partially has a region
			sequence	and involve post-translational	having 54% homology to
				modification of protein. Useful in	CMP-N-acetylneuraminate-
				treating, preventing and diagnosing	β-galactosamide-α-2,
				cancer, and screening for the drug.	6-sialyltransferase, and
				Also useful in modifying the sugar	has a sosui transmembrane
				chain of a recombinant protein,	motif and a
				similar to a human type.	Glyco_transf_29 motif.
54	pg00933	1768	Partial	A motor molecule which converts ATP	Partially has a region
			sequence	(chemical energy) into physical	having 73% homology to
				force so as to move along	dynein heavy chain
				microtubules. While dyneins	isotype 6.
				involved in mitosis, vesicle
				transport, and the movement of cilia
				and flagella exist as multisubunit
				complexes, the protein functions as
				1β dynein heavy chain which is a
				component of the complex. Useful in
				diagnosing and treating cancer, and
				screening for the therapeutic agent.
				Also useful in diagnosing cranial
				nerve diseases, such as
				hydrocephalus, infertility and
				respiratory apparatus-related
				diseases.
55	ph00331	1313	Partial	Inferred to be ubiquitin-binding	Partially has a region
			sequence	enzyme. It is known that with an	having 98% homology to
				abnormal ubiquitinating process,	ubiquitinating enzyme
				cells are unable to differentiate	E2-230 kDa.
				and proliferate, inducing various
				diseases including cancer and
				Parkinson's disease. Useful in
				screening for the therapeutic agent
				of these diseases.
56	pj01645	765	Partial	Inferred to be a gene involved in	Partially has a region
			sequence	cilia formation. Useful in	having 76% homology to
				diagnosing and treating respiratory	KPL2.
				diseases and cilia dysfunction.
57	pj01649	439	Full	Many microtubule-binding proteins	Partially has a region
			length	are present in neurons, and involve	having 58% homology to
				neural axial filament formation.	putative
				Therefore abnormal	microtubule-associated
				microtubule-binding proteins affect	protein.
				neurogenesis and cause malformation
				and teratogeny. Useful in
				diagnosing and treating cancer, and
				screening for the therapeutic agent.
				Also useful in gene diagnosis of
				congenital diseases and the field of
				nerve regeneration medicine.
58	bf00083	879	Full	A pyruvate dehydrogenase	Has a region having 91%
			length	phosphatase activity regulatory	homology to pyruvate
				subunit. Inferred to involve	dehydrogenase
				regulating sugar metabolism.	phosphatase regulatory
				Useful in diagnosing and treating	subunit precursor, and
				cancer, and screening for the	has a DAO, a Phytoene_dh
				therapeutic agent. Also useful in	and a GCV_T motifs.
				screening for an antiobesity agent.
59	bf00135	699	Partial	Kinesin light chain, the motor	Partially has a region
			sequence	protein which moves along	having 36% homology to
				microtubules. Inferred to involve	kinesin light chain, and
				the intracellular transport of	has TPR motifs.
				substances. Directly binds amyloid
				protein precursor (APP), the
				causative agent of Alzheimer's disease, with
				substances, so as to transport the
				substances along neural axial
				filaments in neurons. Useful in
				preventing diseases involved in the
				intracellular transport of
				substances; and diagnosing and
				treating Alzheimer's disease, and screening
				for the therapeutic agent.
60	bg00184	1179	Partial	A novel transcription factor. Many	Partially has a region
			sequence	of them are present as a nuclear	having 99% homology to
				protein in the cerebellum. Useful	TFNR.
				in diagnosing and treating cancer,
				and screening for the therapeutic
				agent.
61	bj00061	802	—	A protein analogous to endozepine,	Partially has a region
				the ligand of the receptor of	having 80% homology to
				benzodiazepine which is classified	endozepine-related
				as an antianxiety agent or sedative	protein precursor, and
				drug/hypnotics. Useful as an	has an ACBP motif and a
				analgesic agent, antianxiety agent	sosui transmembrane
				and anticonvulsant in diagnosing,	motif.
				preventing and treating nervous
				diseases.
62	bj00195	1194	Partial	Type 1 hexokinase, which is	Partially has a region
			sequence	transcribed upon spermatogenesis.	having 94% homology to
				The protein is present in the	cytoplasmic dynein heavy
				acrosome of a sperm, and functions as	chain 2.
				a receptor for ZP3 protein, the
				pellucid zone of an egg, upon
				fertilization. Useful in
				discriminating the maturity of
				sperms and suppressing the function
				of sperm. Also useful in diagnosing
				and treating infertility, and
				contraception.
63	fg01285	1560	Partial	A protein analogous to myosin, which	Partially has a region
			sequence	is involved in intracellular	having 35% homology to
				transport and induces dysgenic	myosin XV, and has a
				congenital asymptomatic auditory	myosin_head motif and a
				disorder DFNB3. Useful in	MyTH4 motif.
				preventing, diagnosing and treating
				nervous diseases involving
				intracellular transport. Also
				useful in the field of medicine of
				nerve regeneration.
64	fh17057	958	Partial	Inferred to be breakpoint cluster	Present on chromosome 14,
			sequence	region protein 2, the product of a	partially has a region
				house keeping gene which encodes a	having 99% homology to
				protein necessary for cell	breakpoint cluster
				survival. Useful in diagnosing	region protein 2, and has
				and treating cancer, and screening	WD40 motifs.
				for the therapeutic agent.
65	ha06731	715	Partial	An analogous protein of HrPOPK-1,	Partially has a region
			sequence	which is inferred to have	having 51% homology to
				serine/threonine kinase activity,	HrPOPK-1, and has a sosui
				and have regulatory functions in	transmembrane motif.
				generation/differentiation, such as
				determination of the embryonic axis.
				Useful in gene diagnosis of
				congenital abnormalities and
				teratogeny, and in the field of
				regeneration medicine. Further,
				useful in diagnosing and treating
				cancer, and screening for the
				therapeutic agent.
66	hj05226	105	—	Homologous to a part of EGF-like	Partially has a region
				domains in a protein (MEGF) having	having 52% homology to
				many EGF-like domains. It is known	MEGF6, and has EGF
				that mutations in the domains affect	motifs.
				cell-to-cell interaction in the
				brain and ligand-receptor
				interaction, so as to cause auxesis
				of the nerve system or
				disorganization of the brain cortex,
				thus induces dementia or the like.
				Useful in diagnosing and treating
				diseases of the brain and the nervous
				system.
67	pf01012	1192	—	Homologous to a part of EGF-like	Partially has a region
				domains in a protein (MEGF) having	having 32% homology to
				many EGF-like domains. It is known	MEGF6, and has EGF
				that mutations in the domains affect	motifs.
				cell-to-cell interaction in the
				brain and ligand-receptor
				interaction, so as to cause auxesis
				of the nerve system or
				disorganization of the brain cortex,
				thus induces dementia or the like.
				Useful in diagnosing and treating
				diseases of the brain and the nervous
				system.
68	fg02852	350	Partial	An analogous protein of p150-Spir	Partially has a region
			sequence	protein which regulates	having 42% homology to
				reconstruction of actin by being	p150-Spir protein.
				phosphorylated with
				stress-responsive phosphoenzyme
				JNK. Useful in diagnosing and
				treating cancer, and screening for
				the therapeutic agent.
69	fh21913a	244	Partial	A protein analogous to fibrillin	Partially has a region
			sequence	which is a major component of a thin	having 74% homology to
				fiber network formed by assembly of	fibrillin 5, and has EGF
				elastin proteins and is present	motifs and a TB motif.
				extensively over the connective
				tissue. With its possible
				involvement in a hereditary disease,
				Marfan syndrome, associated with
				cardiovascular and visual
				disorders, it is useful in the
				diagnosis and the treatment.
70	fj22564	1299	—	A protein having C2H2 type zinc	Partially has a region
				finger motifs. One of intranuclear	having 96% homology to
				proteins expressed in embryonic stem	zinc finger protein and
				cells. Inferred to involve	has zf-C2H2 motifs.
				development, differentiation and
				proliferation. With possible
				involvement in development of early
				embryos, it is inferred to involve
				cell proliferation or
				differentiation. Thus it is useful
				in diagnosing and treating cancer,
				and screening for the therapeutic
				agent. Also it is useful in
				regeneration medicine or gene
				diagnosis of congenital
				abnormalities and teratogeny.

(3) Homology Search for the DNA of the Present Invention

Next, based on the full-length nucleotide sequences thus obtained, the amino acid sequences of the clones were searched against the library of known sequences, nr release 122, using the analysis program BLASTP 2.0.14 (the above-mentioned “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs”). The search showed that each clone was homologous to the corresponding gene listed in Table 2. Table 2 shows the information on these homologous genes, specifically the name, database ID, biological species and its nomenclature, protein length, and the literature containing the information.

TABLE 2


Homologous Gene of Each Gene and Biological Species

SEQ ID NO:	Homologous gene name	Database ID	Biological species^※	Protein length	Literature

1	coronin, actin binding protein	gi\|6753496	Mouse	474	DNA Cell Biol. 17 (9),
	1C				779-787 (1998)
2	zinc finger protein 91	gi\|4508041	Human	1191	Proc. Natl. Acad. Sci.
					U.S.A. 88 (9),
					3608-3612 (1991)
3	immunoglobulin superfamily	gi\|7657226	Human	442	Genomics 62 (2),
					139-146 (1999)
4	NOV/plexin-A1 protein	emb\|CAB57274.1\|	Human	1754	Proc. Natl. Acad. Sci.
					U.S.A. 93 (2), 674-678
					(1996)
5	neuronal guanine nucleotide	gi\|9845277	Mouse	554	Genomics 65 (1), 53-61
	exchange factor				(2000)
6	EVI1 protein	pir\|\|S41705	Human	1042	EMBO J. 13 (3),
					504-510 (1994)
7	acetyl-coenzyme A synthetase	gb\|AAG08119.1\|AE004887_3	Bacillus of green pus	645	Nature 406 (6799),
					959-964 (2000)
8	proline dehydrogenase	gb\|AAD24775.1\|AF120278_1	Human	516	Nat. Genet. 21 (4),
					434-439 (1999)
9	alpha 1,2-mannosidase	gi\|7706437	Human	699	Glycobiology 9 (10),
					1073-1078 (1999)
10	NIM3	gb\|AAF81657.1\|AF199335_1	Rat	285	J. Biol. Chem. 275
					(26), 20033-20044
					(2000)
11	hydroxyproline-rich	emb\|CAB62280.1\|	Volvox	409	J. Biol. Chem. 274
	glycoprotein DZ-HRGP				(49), 35023-35028
					(1999)
12	Netrin-G1c	dbj\|BAB12008.1\|	Mouse	438	J. Neurosci. 20 (17),
					6540-6550 (2000)
13	guanine nucleotide-releasing	gi\|4885357	Human	1077	Proc. Natl. Acad. Sci.
	factor 2				U.S.A. 91 (8),
					3443-3447 (1994)
14	Ran GTPase activating protein 1	gi\|4506411	Human	587	Proc. Natl. Acad. Sci.
					U.S.A. 91 (7),
					2587-2591 (1994)
15	karyopherin beta 2b,	gi\|7305595	Human	887	J. Cell Biol. 138
	transportin				(6), 1181-1192 (1997)
16	PTH-responsive osteosarcoma	gb\|AAD25981.1\|AF095771_1	Human	802	Bone 24 (4), 305-313
	B1 protein				(1999)
17	Cdc25C associated protein	gb\|AAC15093.1\|	Human	729	Cell Growth Differ. 9
	kinase C-TAK1				(3), 197-208 (1998)
18	FRANK2 protein	emb\|CAB96906.1\|	Hawaii's	1596	Genome Res. 10 (8),
			sea urchin		1194-1203 (2000)
19	protein kinase WNK1	gb\|AAF74258.1\|AF227741_1	Rat	2126	J. Biol. Chem. 275
					(22), 16795-16801
					(2000)
20	putative splicing factor	gb\|AAD55973.1\|AF144731_1	Rat	738	Mol. Biol. Cell 10
	YT521-B				(11), 3909-3926
					(1999)
21	SNAP-25-interacting protein	gi\|9507127	Rat	1197	J. Biol. Chem. 275
					(2), 1191-1200 (2000)
22	PUTATIVE PREOPTIC REGULATORY	sp\|P18890\|PRF2_—	Rat	75	Mol. Endocrinol. 4
	FACTOR-2 PRECURSOR				(8), 1205-1210 (1990)
23	adenosine deaminase acting on	gi\|6912230	Human	502	Proc. Natl. Acad. Sci.
	tRNA 1				U.S.A. 96 (16),
					8895-8900 (1999)
24	TTYH1	gb\|AAG02580.1\|AF177909_1	Human	450	Genomics 68 (1),
					89-92 (2000)
25	P116 RHO-INTERACTING PROTEIN	sp\|P97434\|RIP3	Mouse	1024	J. Cell Biol. 137 (7),
	(P116RIP) (RIP3)				1603-1613 (1997)
26	microtubule-associated	gi\|4505099	Human	1152	Cell Motil.
	protein 4				Cytoskeleton 23 (4),
					236-243 (1992)
27	OXYSTEROL-BINDING PROTEIN	sp\|P16258\|OXYB	Rabbit	809	J. Biol. Chem. 264
					(28), 16798-16803
					(1989)
28	ring canal protein	gb\|AAA53472.2\|	Fruit fly	1477	Cell 72 (5), 681-693
					(1993)
29	cell recognition molecule	gi\|7662350	Human	1331	Neuron 24 (4),
	Caspr2				1037-1047 (1999)
30	GTP-rho binding protein 1	gi\|6680085	Mouse	643	Science 271 (5249),
					645-648 (1996)
31	INOSITOL 1,4,5-TRISPHOSPHATE	sp\|P27987\|IP3L	Human	505	Biochem. J. 278 (Pt
	3-KINASE (IP3K)				3), 883-886 (1991)
32	FH1/FH2 domain-containing	gi\|7019375	Human	1164	Gene 232 (2), 173-182
	protein FHOS				(1999)
33	SERINE/THREONINE-PROTEIN	sp\|O08875\|DCK1	Rat	433	J. Mol. Neurosci. 10
	KINASE DCAMKL1				(2), 75-98 (1998)
34	1 beta dynein heavy chain	emb\|CAB99316.1\|	Chlamydomonas	4513	Mol. Biol. Cell 11
					(7), 2297-2313 (2000)
35	rsec8	pir\|\|I59422	Rat	975	Proc. Natl. Acad. Sci.
					U.S.A. 92 (21),
					9613-9617 (1995)
36	nuclear dual-specificity	gb\|AAC39675.1\|	Human	1697	Nature Genet. 18 (4),
	phosphatase				331-337 (1998)
37	phosphatidic acid-preferring	gb\|AAC03019.1\|	Bovine	875	J. Biol. Chem. 273
	phospholipase A1				(1998) 5468-5477
38	chromobox homolog 8	gi\|7304947	Mouse	362	Gene 242 (1-2), 31-40
					(2000)
39	polymerase I and transcript	gi\|6679567	Mouse	392	EMBO J. 17 (10),
	release factor				2855-2864 (1998)
40	pyruvate dehydrogenase	gb\|AAC48785.1\|	Bovine	878	J. Biol. Chem. 272
	phosphatase regulatory				(50), 31625-31629
	subunit precursor				(1997)
41	collagen alpha 1(V) chain	pir\|\|CGHU1V	Human	1838	J. Biol. Chem. 261
	precursor				(11), 5034-5040
					(1986)
42	TUBULIN BETA-5 CHAIN	sp\|P09653\|TBB5_CHICK	Gallus	446	Mol. Cell. Biol. 6
	(BETA-TUBULIN CLASS-V)				(12), 4409-4418
					(1986)
43	PROTEOGLYCAN LINK PROTEIN	sp\|P07354\|PLK_CHICK	Gallus	355	Proc. Natl. Acad. Sci.
	PRECURSOR (CARTILAGE LINK				U.S.A. 83 (11),
	PROTEIN)				3766-3770 (1986)
44	FAT tumor suppressor	gi\|4885229	Human	4590	Genomics 30 (2),
					207-223 (1995)
45	N-ACETYLGLUCOSAMINYLTRANSFERASE	sp\|P56558\|OGT1	Rat	1036	J. Biol. Chem. 272
	110 KDA SUBUNIT				(14), 9308-9315
					(1997)
46	OPA-containing protein 1	gb\|AAC83164.1\|	Mouse	2074	Mol. Psych. 3 (4),
					303-309 (1998)
47	huntingtin-associated protein	gi\|4504335	Human	1663	Hum. Mol. Genet. 6
	interacting protein				(9), 1519-1525 (1997)
48	zinc finger protein 91	gi\|4508041	Human	1191	Proc. Natl. Acad. Sci.
					U.S.A. 88 (9),
					3608-3612 (1991)
49	Proline rich synapse	emb\|CAB45688.1\|	Rat	1806	Biochem. Biophys.
	associated protein 2				Res. Commun. 264,
					2476-2528 (1999)
50	glutamate receptor	gb\|AAD25916.1\|AF072509_1	Rat	1043	Neuron 22, 511-524
	interacting protein 2				(1999)
51	SH3-domain binding protein 5	gi\|4759058	Human	425	Biochem. Biophys.
	(BTK-associated)				Res. Commun. 245 (2),
					337-343 (1998)
52	ankyrin 3	gb\|AAB01607.1\|	Mouse	1961	J. Cell Biol. 130
					(2), 313-330 (1995)
53	CMP-N-ACETYLNEURAMINATE-BETA-	sp\|Q92182\|CAG1_CHICK	Gallus	413	Eur. J. Biochem. 219
	GALACTOSAMIDE-ALPHA-2,				(1-2), 375-381 (1994)
	6-SIALYLTRANSFERASE
54	dynein heavy chain isotype 6	pir\|\|T30298	Globe fish	1125	Mol. Biol. Cell 5 (1),
					57-70 (1994)
55	ubiquitinating enzyme E2-230 kDa	pir\|\|I49264	Mouse	299	Proc. Natl. Acad.
					Sci. U.S.A. 92 (11),
					4982-4986 (1995)
56	KPL2	gb\|AAD56310.1\|AF102129_1	Rat	1744	Am. J. Respir. Cell
					Mol. Biol. 20 (4),
					675-683 (1999)
57	putative microtubule	gb\|AAC79958.1\|	Gallus	369	J. Med. Dent. Sci. 45,
	associated protein				123-133 (1998)
58	pyruvate dehydrogenase	gb\|AAC48785.1\|	Bovine	878	J. Biol. Chem. 272
	phosphatase regulatory				(50), 31625-31629
	subunit precursor				(1997)
59	kinesin light chain	gb\|AAB87735.1\|	Plectonema	490	DNA Cell Biol. 16 (6),
					787-795 (1997)
60	TFNR	emb\|CAC21448.1\|	Human	2187	Genomics 70, 315-326
					(2000)
61	ENDOZEPINE-RELATED PROTEIN	sp\|P07106\|ENDR	Bovine	533	DNA 6 (1), 71-79
	PRECURSOR				(1987)
62	cytoplasmic dynein heavy chain 2	gi\|12711694	Rat	4306	Mol. Biol. Cell 9, 276
					(1998)
63	Myosin XV	gi\|6754780	Mouse	3511	Genomics 61 (3),
					243-258 (1999)
64	breakpoint cluster region	gb\|AAC08965.1\|	Human	510	Genomics 52 (1), 17-26
	protein 2				(1998)
65	HrPOPK-1	dbj\|BAA28663.1\|	Ascidian	698	Mech. Dev. 76 (1-2),
					161-163 (1998)
66	MEGF6	gi\|12621134	Rat	1574	Genomics 51 (1), 27-34
					(1998)
67	MEGF6	gi\|12621134	Rat	1574	Genomics 51 (1), 27-34
					(1998)
68	p150-Spir protein	emb\|CAB62901.1\|	Fruit fly	1020	Curr. Biol. 10 (6),
					345-348 (2000)
69	fibrillin 5	emb\|CAB56757.1\|	Human	754	Nature 352 (6333),
					330-334 (1991)
70	zinc finger protein	pir\|\|B38203	Mouse	191	Genes Dev. 6 (6),
					903-918 (1992)

^※Nomenclature of each biological species is as follows: mouse = Mus musculus; human = Homo sapiens; bacillus of green pus = Pseudomonas aeruginosa; rat = Rattus norvegicus; volvox = Volvox carteri f. nagariensis; Hawaii's sea urchin = Tripneustes gratilla; rabbit = Oryctolagus cuniculus; fruit fly = Drosophila melanogaster; chlamydomonas = Chlamydomonas reinhardtii; bovine = Bos taurus; gallus = Gallus gallus; globe fish = Takifugu rubripes; plectonema = Plectonema boryanum; ascidian = Halocynthia roretzi.
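
The database IDs in Table 2 are pipe-delimited: a source tag (gi, gb, emb, dbj, pir or sp) followed by one or more identifiers. As an illustration only, the following minimal Python sketch splits such an ID into its parts. The function name `parse_database_id` and the returned dictionary layout are hypothetical and not part of the application; the sketch also assumes that a backslash before a pipe is an escaping artifact of the text and can be dropped.

```python
def parse_database_id(raw: str) -> dict:
    """Split a pipe-delimited database ID into its source tag and identifiers."""
    # Assumption: a backslash in front of a pipe is only an escape and is removed.
    parts = raw.replace("\\|", "|").split("|")
    # Empty fields (as in "pir||S41705" or after a trailing pipe) are dropped.
    return {"database": parts[0], "identifiers": [p for p in parts[1:] if p]}


# IDs taken from Table 2 (SEQ ID NOS: 1, 31, 6 and 4).
examples = ["gi|6753496", "sp|P27987|IP3L", "pir||S41705", "emb|CAB57274.1|"]
parsed = [parse_database_id(e) for e in examples]
```

The sketch does not interpret the identifiers further (for example, it does not distinguish an accession from an entry name), since the table itself does not state that distinction.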

Table 3 summarizes a variety of data concerning homology between the DNA or the gene of the present invention contained in each clone and each homologous gene listed in Table 2. The meaning of each item in Table 3 is as follows:

Score: the higher the value, the higher the reliability
E-value: the closer this value is to 0, the higher the reliability
Starting point: an amino acid position as a starting point of the homologous region
End point: an amino acid position as an end point of the homologous region
Homology: the proportion (degree) of amino acid residues that are identical in a homologous region.

Homologous region %: the proportion (%) of a homologous region in a homologous gene.
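
As an illustration of the two percentage items defined above, the following minimal Python sketch recomputes them from the other columns of Table 3 and the protein length in Table 2. The function names are hypothetical. The rounding conventions (truncation for Homology, rounding to the nearest integer for Homologous region %) are assumptions inferred from a few rows of the tables; the application does not state them.

```python
def homology_percent(identical: int, aligned: int) -> int:
    """Identical residues as a percentage of the aligned region (assumed truncated)."""
    return 100 * identical // aligned


def homologous_region_percent(start: int, end: int, protein_length: int) -> int:
    """Share of the homologous gene's protein covered by the homologous region.

    start and end are the starting and end points on the homologous gene;
    protein_length is the protein length listed in Table 2 (assumed rounded).
    """
    return round(100 * (end - start + 1) / protein_length)


# SEQ ID NO: 1: 155/310 identical residues, homologous gene positions 211-472,
# protein length 474 (Table 2).
seq1_homology = homology_percent(155, 310)
seq1_region = homologous_region_percent(211, 472, 474)
```

For SEQ ID NO: 1 this gives 50 and 55, which agrees with the 50% and 55% listed in Table 3.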

TABLE 3


Homology between each gene and homologous gene

SEQ ID NO:	Clone starting point	Clone end point	Homologous gene starting point	Homologous gene end point	Score	E-value	Homology	Homology (identical/aligned residues)	Homologous region %

1	20	318	211	472	295	4e−79	50%	(155/310)	55%
2	1	183	975	1166	156	3e−37	42%	(81/192)	16%
3	20	187	259	442	122	3e−27	39%	(74/187)	42%
4	1	1324	278	1601	2708	0	99%	(1321/1324)	75%
5	71	217	322	468	266	2e−70	91%	(135/147)	27%
6	4	653	294	923	599	e−170	51%	(343/661)	60%
7	1	235	303	534	325	3e−88	60%	(142/235)	36%
8	51	193	283	425	275	3e−73	93%	(134/143)	28%
9	92	373	418	699	591	e−168	100%	(282/282)	40%
10	6	215	76	285	303	9e−82	68%	(143/210)	74%
11	875	1130	27	277	144	6e−33	37%	(96/256)	61%
12	36	380	28	372	444	e−124	57%	(199/346)	79%
13	47	505	619	1077	928	0	100%	(459/459)	43%
14	37	530	1	494	963	0	100%	(494/494)	84%
15	165	1051	1	887	1794	0	98%	(874/887)	100%
16	62	856	1	755	1499	0	94%	(753/795)	94%
17	2	669	1	655	816	0	64%	(435/674)	90%
18	2	998	515	1592	593	e−168	35%	(398/1127)	68%
19	187	748	739	1328	189	9e−47	30%	(188/625)	28%
20	1	480	251	738	1001	0	97%	(476/488)	66%
21	1	718	457	1173	1364	0	94%	(683/719)	60%
22	1020	1094	1	75	152	1e−35	98%	(74/75)	100%
23	26	388	97	459	738	0	99%	(360/363)	72%
24	6	413	7	417	306	3e−82	39%	(163/411)	91%
25	1063	1644	383	975	452	e−125	44%	(273/611)	58%
26	140	913	332	1089	835	0	61%	(491/793)	66%
27	54	878	20	809	965	0	60%	(498/830)	98%
28	1	278	408	684	136	3e−31	32%	(91/284)	19%
29	1	705	597	1329	663	0	46%	(343/736)	55%
30	94	259	160	325	267	1e−70	80%	(133/166)	26%
31	319	519	3	203	419	e−116	100%	(201/201)	40%
32	2	361	216	569	182	2e−44	37%	(142/379)	30%
33	308	445	75	212	152	7e−36	51%	(71/138)	32%
34	1	1353	3205	4512	966	0	37%	(515/1363)	29%
35	1	966	9	975	1816	0	95%	(921/967)	99%
36	110	1762	5	1630	1828	0	57%	(957/1674)	96%
37	4	498	381	875	979	0	95%	(475/495)	57%
38	1	389	1	362	667	0	87%	(341/389)	100%
39	1	380	11	392	682	0	92%	(355/382)	97%
40	1	360	519	878	715	0	93%	(336/360)	41%
41	2	592	634	1274	545	e−154	46%	(304/647)	35%
42	14	369	90	445	720	0	97%	(346/356)	80%
43	36	376	19	353	361	7e−99	49%	(171/345)	94%
44	3	615	2979	3576	255	1e−66	30%	(189/618)	13%
45	1	1036	1	1036	2117	0	99%	(1030/1036)	100%
46	129	263	1911	2052	101	7e−21	50%	(72/144)	7%
47	53	1115	588	1663	2132	0	98%	(1063/1076)	65%
48	66	760	425	1157	744	0	48%	(365/758)	62%
49	1	797	1010	1806	1504	0	94%	(754/799)	44%
50	4	1043	11	1043	1814	0	87%	(912/1040)	99%
51	80	365	7	269	213	2e−54	41%	(120/287)	62%
52	658	1057	213	611	149	2e−34	30%	(126/413)	20%
53	211	524	100	413	371	e−102	54%	(172/315)	76%
54	717	1768	1	1047	1570	0	73%	(774/1060)	93%
55	129	368	1	240	497	e−139	98%	(236/240)	80%
56	6	735	965	1715	1195	0	76%	(579/753)	43%
57	1	439	1	369	472	e−132	58%	(258/440)	100%
58	1	879	1	878	1680	0	91%	(804/878)	100%
59	264	557	179	446	153	5e−36	36%	(108/294)	55%
60	1	742	1446	2187	1463	0	99%	(741/742)	34%
61	6	437	93	533	722	0	80%	(357/443)	83%
62	1	1194	3113	4306	2274	0	94%	(1126/1194)	28%
63	146	965	1653	2499	494	e−138	35%	(319/888)	24%
64	278	787	1	510	1048	0	99%	(508/510)	100%
65	1	650	44	647	611	e−174	51%	(346/678)	87%
66	4	94	330	419	117	5e−26	52%	(48/91)	6%
67	74	892	32	1002	499	e−139	32%	(315/982)	62%
68	62	318	213	456	189	6e−47	42%	(109/259)	24%
69	1	242	163	405	435	e−121	74%	(181/243)	32%
70	798	988	1	191	389	e−106	96%	(185/191)	100%

(4) Search for Each Domain

Using as queries the amino acid sequences encoded by the DNAs contained in the clones, functional domains were searched for with the search tool contained in Pfam 6.0 (Pfam HMM ver. 2.1 Search (HMMPFAM); Sonnhammer, E. L. L., Eddy, S. R., Birney, E., Bateman, A., and Durbin, R. (1998) “Pfam: multiple sequence alignments and HMM-profiles of protein domains” Nucleic Acids Res. 26:320-322).

Further, transmembrane domains were searched for with a prediction program for membrane proteins, the SOSUI system (ver. 1.0/10, March, 1996) (Takatsugu Hirokawa, Seah Chieng and Shigeki Mitaku, “SOSUI: Classification and Secondary Structure Prediction System for Membrane Proteins”, Bioinformatics (formerly CABIOS) 1998 May; 14(4):378-379).

Table 4 shows the detected functional domains and transmembrane domains for each clone.

The meaning of each item in Table 4 is as follows:

Functional domain: a domain detected by Pfam or SOSUI
Starting point: an amino acid position as a starting point of a functional domain
End point: an amino acid position as an end point of a functional domain.
Score (Pfam only): the higher the value, the higher the reliability
Exp (Pfam only): the closer the value is to 0, the higher the reliability
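
As an illustration of how the Exp item can be read, the following minimal Python sketch keeps only the domain hits whose Exp value is at or below a cutoff. The `DomainHit` type, the function name and the cutoff of 0.01 are hypothetical; the application does not define a cutoff. SOSUI hits carry no Score or Exp, so the sketch keeps them unchanged. The example rows are the homologous gene entries listed for SEQ ID NO: 3 in Table 4.

```python
from typing import List, NamedTuple, Optional


class DomainHit(NamedTuple):
    domain: str
    start: int
    end: int
    score: Optional[float]  # None for SOSUI hits, which have no Score
    exp: Optional[float]    # None for SOSUI hits, which have no Exp


def confident_hits(hits: List[DomainHit], max_exp: float = 0.01) -> List[DomainHit]:
    """Keep hits without an Exp value and hits whose Exp is at or below max_exp."""
    return [h for h in hits if h.exp is None or h.exp <= max_exp]


# Homologous gene domains for SEQ ID NO: 3, as listed in Table 4.
hits = [
    DomainHit("sosui", 16, 38, None, None),
    DomainHit("ig", 57, 126, 23.3, 1.1e-05),
    DomainHit("ig", 159, 222, 9.6, 0.21),
    DomainHit("ig", 260, 315, 36.6, 8.6e-10),
    DomainHit("sosui", 374, 396, None, None),
]
kept = confident_hits(hits)
```

With the assumed cutoff, the ig hit at positions 159 to 222 (Exp 0.21) is dropped and the other four hits are kept.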

Table 5 shows the complete notation of each functional domain.

TABLE 4


Functional domain

SEQ ID NO:	Clone functional domain	Clone starting point	Clone end point	Clone score	Clone Exp	Homologous gene functional domain	Homologous gene starting point	Homologous gene end point	Homologous gene score	Homologous gene Exp

1						WD40	72	109	33.1	6.2e−06
						WD40	122	159	28.6	0.00015
						WD40	166	202	21.4	0.022
2	zf-C2H2	45	67	31.2	2.4e−05	KRAB	13	75	159.5	5.6e−44
	zf-C2H2	73	95	37.1	3.9e−07	zf-C2H2	182	200	−1.9	1.3e+02
	zf-C2H2	123	145	23.2	0.0063	zf-C2H2	210	232	21.8	0.017
	zf-C2H2	151	173	31.3	2.2e−05	zf-C2H2	238	260	33.1	6.3e−06
	zf-C2H2	179	201	27.7	0.00026	zf-C2H2	266	288	34	3.4e−06
	zf-C2H2	261	283	31.3	2.3e−05	zf-C2H2	294	316	37.9	2.3e−07
	zf-C2H2	289	311	38.4	1.6e−07	zf-C2H2	322	344	37.9	2.2e−07
						zf-C2H2	350	372	36.1	7.8e−07
						zf-C2H2	378	400	34.9	1.8e−06
						zf-C2H2	406	428	35.3	1.4e−06
						zf-C2H2	434	456	35.3	1.4e−06
						zf-C2H2	462	484	33.8	3.9e−06
						zf-C2H2	490	512	37.1	4e−07
						zf-C2H2	518	540	15.7	1.1
						zf-C2H2	546	568	32.4	1.1e−05
						zf-C2H2	574	596	34	3.4e−06
						zf-C2H2	602	624	34.2	3.1e−06
						zf-C2H2	630	652	37.9	2.2e−07
						zf-C2H2	658	680	36.1	7.8e−07
						zf-C2H2	686	708	34.9	1.8e−06
						zf-C2H2	714	736	35.3	1.4e−06
						zf-C2H2	742	764	36.8	5e−07
						zf-C2H2	770	792	35	1.7e−06
						zf-C2H2	798	820	37.3	3.4e−07
						zf-C2H2	826	848	34.8	2e−06
						zf-C2H2	854	876	37	4.2e−07
						zf-C2H2	885	904	11.3	6.3
						zf-C2H2	910	932	38.2	1.9e−07
						zf-C2H2	938	960	36.5	5.9e−07
						zf-C2H2	966	988	34.3	2.8e−06
						zf-C2H2	994	1016	39	1.1e−07
						zf-C2H2	1022	1044	35.1	1.6e−06
						zf-C2H2	1050	1072	33.8	3.8e−06
						zf-C2H2	1078	1100	39	1.1e−07
						zf-C2H2	1106	1128	32.9	7.5e−06
						zf-C2H2	1134	1156	19.5	0.081
3	ig	21	87	16.9	0.0012	sosui	16	38	—	—
	sosui	120	142	—	—	ig	57	126	23.3	1.1e−05
						ig	159	222	9.6	0.21
						ig	260	315	36.6	8.6e−10
						sosui	374	396	—	—
4	Sema	14	196	78.3	2.9e−20	Sema	29	473	198.8	8.3e−56
	Plexin_repeat	215	265	58.1	1.9e−13	Plexin_repeat	492	542	58.1	1.9e−13
	integrin_B	221	237	5.1	0.18	integrin_B	498	514	5.1	0.18
	Plexin_repeat	361	408	59.8	5.7e−14	Plexin_repeat	638	685	59.8	5.7e−14
	Plexin_repeat	509	563	50.7	3.3e−11	Plexin_repeat	786	840	50.7	3.3e−11
	integrin_B	516	535	9.4	0.006	integrin_B	793	812	9.4	0.006
	TIG	565	660	75.5	1.1e−18	TIG	842	937	75.5	1.1e−18
	TIG	662	746	86.3	6.3e−22	TIG	939	1023	86.3	6.3e−22
	TIG	749	848	71.3	2e−17	TIG	1026	1125	71.3	2e−17
	TIG	851	937	42.9	7.2e−09	TIG	1128	1214	42.9	7.2e−09
	sosui	943	965	—	—
5	PH	83	194	43.6	7.7e−11	RhoGEF	121	297	71.2	2.2e−17
						PH	334	445	42.8	1.3e−10
						SH3	459	515	47.3	3.3e−10
6	zf-C2H2	449	471	37.3	3.4e−07	zf-C2H2	21	44	26.3	0.00074
	zf-BED	462	501	11.3	0.073	zf-C2H2	75	97	26.5	0.00061
	zf-C2H2	477	500	33.5	4.8e−06	zf-C2H2	103	125	35.1	1.6e−06
	zf-C2H2	506	528	26.5	0.00061	zf-BED	116	155	−0.3	1.5
						zf-C2H2	131	154	37.2	3.8e−07
						zf-C2H2	160	182	32	1.4e−05
						zf-C2H2	188	210	26.5	0.00064
						zf-C2H2	217	239	29.5	7.8e−05
						zf-C2H2	724	746	37.3	3.4e−07
						zf-BED	737	776	11.3	0.073
						zf-C2H2	752	775	33.5	4.8e−06
						zf-C2H2	781	803	27.5	0.0003
7						AMP-binding	108	544	446.8	1.8e−130
8						Pro_dh	143	498	582.7	2.4e−171
9	Glyco_hydro_47	2	369	205	1.2e−57	sosui	83	105	—	—
	sosui	17	39	—	—	Glyco_hydro_47	256	695	696	1.9e−205
10	C2	76	163	45.6	1.1e−09	C2	146	233	61.9	1.4e−14
12	BNR	44	55	9.4	34	sosui	1	23	—	—
	laminin_Nterm	58	304	30	5.4e−12	laminin_Nterm	50	295	37.3	1.6e−12
	BNR	171	182	11.7	16	laminin_EGF	297	341	33.3	5.4e−06
	laminin_EGF	306	363	35.2	1.5e−06	sosui	419	438	—	—
13	RasGEFN	114	170	36.7	5.3e−07	RasGEFN	686	742	36.7	5.3e−07
	RasGEF	265	442	206.6	3.8e−58	RasGEF	837	1014	206.6	3.8e−58
15	Armadillo_seg	281	319	13.4	2.6	Armadillo_seg	117	155	13.4	2.6
	Armadillo_seg	364	406	0.4	84	Armadillo_seg	200	242	0.4	84
	HEAT	373	408	10.9	9	HEAT	209	244	10.9	9
	HEAT	551	589	10.3	11	HEAT	387	425	10.3	11
	Armadillo_seg	590	628	19.8	0.064	Armadillo_seg	426	464	19.8	0.064
	HEAT	592	630	1.4	1.1e+02	HEAT	428	466	1.4	1.1e+02
	HEAT	718	758	1.7	1e+02	HEAT	554	594	1.7	1e+02
	Armadillo_seg	819	857	1.2	68	HEAT	657	695	11.7	7.3
	HEAT	821	859	11.6	7.6	Armadillo_seg	695	734	17.3	0.37
	Armadillo_seg	859	898	11.4	4.5
17	pkinase	60	311	351.3	1.1e−101	pkinase	56	307	345.6	5.3e−100
	UBA	331	370	30.8	3.1e−05	UBA	327	366	31.1	2.6e−05
18	ank	50	83	6.3	36	ank	563	594	4.6	58
	ank	84	116	42.3	1.1e−08	ank	627	659	18.1	0.2
	ank	117	149	39.4	8.3e−08	ank	660	692	42.8	7.6e−09
	ank	150	182	40.1	5e−08	ank	693	725	39.8	6e−08
	ank	183	217	18.5	0.16	ank	726	760	25.9	0.00094
	ank	253	286	23.4	0.0053
19						pkinase	221	479	215.7	7e−61
21	Troponin	230	378	−18.4	0.97	Troponin	685	833	−17.8	0.87
22	WW	37	67	4.5	2.3	sosui	1	23	—	—
	WW	76	106	23.4	0.0054
	MyTH4	772	890	78.5	1.4e−19
	RhoGAP	920	1067	76.1	7.3e−19
23	A_deamin	27	76	17.7	0.0003	A_deamin	63	147	86.3	3.9e−24
	A_deamin	220	295	76.3	3.2e−21	A_deamin	431	497	16.1	0.00084
24	sosui	1	19	—	—	sosui	46	68	—	—
	sosui	43	65	—	—	sosui	87	108	—	—
	sosui	83	104	—	—	sosui	214	236	—	—
	sosui	175	197	—	—	sosui	247	269	—	—
	sosui	209	231	—	—	sosui	390	412	—	—
	sosui	240	262	—	—
	sosui	388	410	—	—
25	PH	1067	1175	77.3	2.6e−20	PH	44	145	49.8	1.4e−12
						PH	387	482	75.2	9.6e−20
26	tubulin-binding	747	777	54.8	1.1e−14	tubulin-binding	923	953	54.8	1.1e−14
	tubulin-binding	816	846	61.8	9.4e−17	tubulin-binding	992	1022	61.8	9.4e−17
	tubulin-binding	847	877	61	1.6e−16	tubulin-binding	1023	1053	61	1.6e−16
	tubulin-binding	878	909	36.5	3.2e−09	tubulin-binding	1054	1085	36.5	3.2e−09
27	PH	145	236	78.7	1e−20	PH	91	183	87.8	2.8e−23
	Oxysterol_BP	446	868	506.2	2.4e−148	Oxysterol_BP	383	799	770.3	7.5e−228
28	Kelch	32	79	30	5.5e−05	BTB	141	253	127.6	2.3e−34
	Kelch	81	126	43.2	6e−09	Kelch	392	436	20.4	0.042
	Kelch	128	176	30.1	5.1e−05	Kelch	438	483	44.6	2.3e−09
	Kelch	178	218	42.8	7.5e−09	Kelch	485	530	52.7	8.3e−12
	Kelch	220	267	48.3	1.7e−10	Kelch	532	579	49.9	5.5e−11
						Kelch	581	626	49	1.1e−10
						Kelch	628	673	48.9	1.1e−10
29	laminin_G	220	342	76.1	1.6e−20	sosui	7	29	—	—
	EGF	361	395	19.3	0.089	F5_F8_type_C	38	178	216.5	4e−61
	TSPN	380	577	−42.4	0.71	laminin_G	216	348	62.2	1.5e−16
	laminin_G	472	530	8.4	0.37	laminin_G	401	532	48.5	1.3e−12
	sosui	639	661	—	—	EGF	558	590	30.3	4.4e−05
						laminin_G	827	948	53.8	3.8e−14
						EGF	967	1001	18.2	0.19
						laminin_G	1055	1185	15.6	0.0032
30	BR01	66	201	123.2	4.7e−33	HR1	42	114	89.2	8.4e−23
						BR01	115	267	276.1	4.6e−79
						PDZ	500	577	45	1.6e−09
32	FH2	661	1157	178.1	1.5e−49	FH2	594	1069	187.1	2.7e−52
33	pkinase	316	445	170.2	3.4e−47	pkinase	83	340	326.3	3.5e−94
34	Dynein_heavy	597	1351	920	6.8e−273	Dynein_heavy	3804	4510	910.4	5.1e−270
36	DENN	44	183	111.1	7e−30	DENN	5	78	70	2.2e−18
	GRAM	756	842	46.6	3.8e−12	GRAM	650	736	62.4	9.1e−17
	PH	1661	1764	68	1e−17	PH	1529	1632	63.6	1.8e−16
37	DDHD	237	484	401.4	8.6e−117	DDHD	614	861	431.3	8.9e−126
38	chromo	8	48	67.7	2.1e−18	chromo	8	48	67.7	2.1e−18
40	GCV_T	4	344	556.6	1.6e−163	DAO	42	403	−70.5	0.0014
						Phytoene_dh	44	405	−327.7	0.14
						UPF0079	52	149	−47.5	0.26
						GCV_T	522	862	593.6	1.2e−174
41	Collagen	5	64	46.8	4.9e−10	TSPN	39	230	289.8	3.3e−83
	Collagen	65	124	50.1	5.1e−11	Collagen	469	528	23.8	0.00018
	Collagen	125	184	52.1	1.2e−11	Collagen	554	612	27.7	0.00011
	Collagen	188	247	48.8	1.2e−10	Collagen	613	672	61.8	1.5e−14
	Collagen	251	310	53.8	3.7e−12	Collagen	673	732	60.8	3e−14
	Collagen	312	371	58.4	1.5e−13	Collagen	733	792	42.8	7.5e−09
	Collagen	384	443	48.5	1.5e−10	Collagen	793	852	37.7	2.7e−07
	Collagen	450	509	50.8	3e−11	Collagen	853	912	43.9	3.6e−09
	Collagen	534	593	44.8	1.9e−09	Collagen	913	972	59.9	5.4e−14
	COLFI	648	706	92.7	1e−35	Collagen	985	1044	54.4	2.5e−12
	COLFI	715	831	56.7	1.1e−21	Collagen	1045	1104	55.2	1.5e−12
						Collagen	1105	1164	55.7	9.9e−13
						Collagen	1165	1224	49.5	7.2e−11
						Collagen	1225	1284	46.7	5.3e−10
						Collagen	1285	1344	48.9	1.1e−10
						Collagen	1345	1404	43.5	4.7e−09
						Collagen	1405	1464	46.4	6.3e−10
						Collagen	1465	1524	55.9	9.1e−13
						Collagen	1525	1584	18.9	0.00032
						COLFI	1625	1836	484.1	1.1e−188
42	tubulin	14	347	737	8e−218
43	sosui	16	36	—	—	ig	54	142	25.8	2e−06
	ig	71	155	23.8	8e−06	Xlink	159	254	220.3	7.1e−95
	Xlink	172	277	162	1.2e−69	Xlink	260	351	196	2.3e−84
	Xlink	283	374	132	1.3e−56
44	cadherin	56	143	86.5	5.3e−22	sosui	4	26	—	—
	cadherin	157	253	81.7	1.5e−20	cadherin	39	140	14.8	0.04
	cadherin	267	360	69.7	6.2e−17	cadherin	154	248	70.4	3.8e−17
	cadherin	374	470	50.8	3.1e−11	cadherin	372	454	20.6	0.013
	cadherin	486	577	94.6	1.9e−24	cadherin	468	560	69.2	8.8e−17
	cadherin	591	682	72.4	9.5e−18	cadherin	574	664	24.7	0.0022
						cadherin	722	813	68.8	1.1e−16
						cadherin	827	918	98.3	1.5e−25
						cadherin	932	1023	96.9	3.9e−25
						cadherin	1039	1130	83.9	3.3e−21
						cadherin	1144	1236	105.7	8.8e−28
						cadherin	1250	1346	41	2.7e−08
						cadherin	1363	1447	46.3	6.6e−10
						cadherin	1461	1553	53.5	4.7e−12
						cadherin	1567	1661	89.5	6.6e−23
						cadherin	1675	1759	69.7	6e−17
						cadherin	1773	1871	47.7	2.7e−10
						cadherin	1887	1973	36.7	5.4e−07
						cadherin	1987	2073	29.5	7.9e−05
						cadherin	2089	2178	43.9	3.6e−09
						cadherin	2190	2277	53.8	3.9e−12
						cadherin	2291	2384	96.1	7e−25
						cadherin	2398	2486	44.1	3.1e−09
						cadherin	2500	2590	48.6	1.4e−10
						cadherin	2604	2696	47.1	3.9e−10
						cadherin	2710	2802	35.4	1.3e−06
						cadherin	2816	2911	81.5	1.8e−20
						cadherin	2925	3016	68.3	1.7e−16
						cadherin	3030	3118	94.1	2.8e−24
						cadherin	3132	3223	99.9	5e−26
						cadherin	3237	3328	104.8	1.7e−27
						cadherin	3342	3433	106.6	4.8e−28
						cadherin	3447	3538	47.2	3.7e−10
						cadherin	3553	3634	11.5	0.077
						EGF	3796	3828	19.5	0.077
						laminin_G	3861	3990	75.5	2.3e−20
						EGF	4019	4051	36.8	4.9e−07
						EGF	4058	4089	31.7	1.7e−05
						EGF	4095	4126	35.6	1.1e−06
						EGF	4133	4164	32.9	7.2e−06
45	TPR	11	44	1.6	31	TPR	11	44	1.6	31
	TPR	79	112	50.7	3.3e−11	TPR	79	112	50.7	3.3e−11
	TPR	113	146	26.3	0.00073	TPR	113	146	26.3	0.00073
	TPR	147	180	31	2.8e−05	TPR	147	180	31	2.8e−05
	TPR	181	214	39.8	6.1e−08	TPR	181	214	39.8	6.1e−08
	TPR	215	248	31.9	1.5e−05	TPR	215	248	31.9	1.5e−05
	TPR	249	282	38.8	1.3e−07	TPR	249	282	38.8	1.3e−07
	TPR	283	316	39.5	7.4e−08	TPR	283	316	39.5	7.4e−08
	TPR	317	350	40.1	5e−08	TPR	317	350	40.1	5e−08
	TPR	351	384	39.7	6.4e−08	TPR	351	384	39.7	6.4e−08
	TPR	385	418	41.4	2e−08	TPR	385	418	41.4	2e−08
	TPR	419	452	38.2	1.9e−07	TPR	419	452	38.2	1.9e−07
46						sosui	179	200	—	—
						sosui	266	288	—	—
47	RhoGEF	737	907	120.3	3.6e−32	spectrin	188	235	12	0.074
	PH	921	1032	45.6	2.1e−11	spectrin	279	308	9.6	0.34
						spectrin	310	416	23.7	4.2e−05
						spectrin	536	642	20.1	0.00043
						spectrin	803	877	−9.2	3.5e+04
						spectrin	890	937	7	1.8
						spectrin	958	1004	7.8	1.1
						spectrin	1130	1222	17.6	0.0021
						RhoGEF	1285	1455	120.3	3.6e−32
						PH	1469	1580	45.6	2.1e−11
48	zf-C2H2	103	125	38.1	1.9e−07	KRAB	13	75	159.5	5.6e−44
	zf-C2H2	131	153	31.8	1.6e−05	zf-C2H2	182	200	−1.9	1.3e+02
	zf-C2H2	159	181	34.5	2.4e−06	zf-C2H2	210	232	21.8	0.017
	zf-C2H2	187	209	35.5	1.2e−06	zf-C2H2	238	260	33.1	6.3e−06
	zf-C2H2	215	237	30.8	3.2e−05	zf-C2H2	266	288	34	3.4e−06
	zf-C2H2	243	265	30.1	5.2e−05	zf-C2H2	294	316	37.9	2.3e−07
	zf-C2H2	271	293	28.6	0.00015	zf-C2H2	322	344	37.9	2.2e−07
	zf-C2H2	299	321	29.6	7.4e−05	zf-C2H2	350	372	36.1	7.8e−07
	zf-C2H2	352	374	20.2	0.051	zf-C2H2	378	400	34.9	1.8e−06
	zf-C2H2	380	402	37.2	3.7e−07	zf-C2H2	406	428	35.3	1.4e−06
	zf-BED	393	431	5.6	0.32	zf-C2H2	434	456	35.3	1.4e−06
	zf-C2H2	408	430	30.2	4.7e−05	zf-C2H2	462	484	33.8	3.9e−06
	zf-C2H2	436	458	38.1	2e−07	zf-C2H2	490	512	37.1	4e−07
	zf-C2H2	464	486	37.6	2.9e−07	zf-C2H2	518	540	15.7	1.1
	zf-C2H2	492	514	25.4	0.0013	zf-C2H2	546	568	32.4	1.1e−05
	zf-C2H2	544	566	22.8	0.0079	zf-C2H2	574	596	34	3.4e−06
	zf-C2H2	572	594	37.2	3.7e−07	zf-C2H2	602	624	34.2	3.1e−06
	zf-BED	585	623	3.9	0.5	zf-C2H2	630	652	37.9	2.2e−07
	zf-C2H2	600	622	34.9	1.9e−06	zf-C2H2	658	680	36.1	7.8e−07
	zf-C2H2	628	650	38.2	1.9e−07	zf-C2H2	686	708	34.9	1.8e−06
	zf-C2H2	656	678	32.6	9.3e−06	zf-C2H2	714	736	35.3	1.4e−06
	zf-C2H2	684	706	24.4	0.0026	zf-C2H2	742	764	36.8	5e−07
	zf-C2H2	737	759	20.9	0.03	zf-C2H2	770	792	35	1.7e−06
						zf-C2H2	798	820	37.3	3.4e−07
						zf-C2H2	826	848	34.8	2e−06
						zf-C2H2	854	876	37	4.2e−07
						zf-C2H2	885	904	11.3	6.3
						zf-C2H2	910	932	38.2	1.9e−07
						zf-C2H2	938	960	36.5	5.9e−07
						zf-C2H2	966	988	34.3	2.8e−06
						zf-C2H2	994	1016	39	1.1e−07
						zf-C2H2	1022	1044	35.1	1.6e−06
						zf-C2H2	1050	1072	33.8	3.8e−06
						zf-C2H2	1078	1100	39	1.1e−07
						zf-C2H2	1106	1128	32.9	7.5e−06
						zf-C2H2	1134	1156	19.5	0.081
49	SAM	732	795	79.8	5.7e−20	ank	223	256	12.3	6.6
						ank	257	289	28	0.00022
						ank	290	323	12.6	6
						ank	324	356	21.5	0.021
						ank	357	389	37.1	4.1e−07
						SH3	548	602	36.5	6.2e−07
						PDZ	645	738	22.2	0.007
						SAM	1741	1804	80.5	3.5e−20
50	PDZ	48	130	51.2	2.3e−11	PDZ	53	135	48.8	1.2e−10
	PDZ	148	233	54.1	3e−12	PDZ	153	238	53.5	4.7e−12
	PDZ	248	331	42.3	1.1e−08	PDZ	253	336	46.7	5.3e−10
	PDZ	456	544	42.5	9.2e−09	PDZ	458	546	39	1.1e−07
	PDZ	557	640	62	1.3e−14	PDZ	559	642	55.7	1e−12
	PDZ	656	737	69.4	7.5e−17	PDZ	658	739	65.6	1e−15
	PDZ	941	1022	36.4	6.5e−07	PDZ	942	1023	26.6	0.0006
52	ank	679	714	6.6	33	ank	23	55	3.5	80
	ank	717	749	6.1	38	ank	56	88	43.2	5.7e−09
	ank	750	782	33.2	6.2e−06	ank	89	121	45.3	1.4e−09
	ank	783	815	4.8	55	ank	122	154	42.7	8.4e−09
	ank	823	859	4.1	67	ank	155	183	14.3	2.9
	ank	861	893	31.5	1.9e−05	ank	184	216	18.6	0.15
	ank	894	926	40	5.4e−08	ank	217	249	36.1	8e−07
	ank	927	959	42.3	1.1e−08	ank	250	282	45.6	1.1e−09
	ank	960	992	35.7	1.1e−06	ank	283	315	39.3	8.4e−08
	ank	993	1025	38.1	2e−07	ank	316	348	39.3	8.8e−08
	ank	1026	1058	12.9	5.5	ank	349	381	39.1	9.7e−08
	TPR	1072	1105	0.2	43	ank	382	414	46.6	5.5e−10
	TPR	1119	1152	16.4	0.7	ank	415	447	39.7	6.7e−08
	TPR	1153	1186	25.6	0.0012	ank	448	480	42.6	9.1e−09
						ank	481	513	40.2	4.7e−08
						ank	514	546	49.7	6.5e−11
						ank	547	579	43.6	4.5e−09
						ank	580	612	38.3	1.7e−07
						ank	613	645	47.2	3.6e−10
						ank	646	678	36.3	7e−07
						ank	679	711	42.7	8e−09
						ank	712	744	43.8	3.9e−09
						ank	745	777	41	2.8e−08
						ank	778	810	2.2	1.1e+02
						ZU5	983	1087	229.6	4.4e−65
						death	1479	1562	111.4	1.8e−29
53	sosui	16	38	—	—	sosui	9	30	—	—
	Glyco_transf_29	218	512	243.2	3.6e−69	Glyco_transf_29	107	401	448	8.2e−131
54	GSPII_E	1032	1046	8.6	0.092
55	UQ_con	971	1136	0.9	2.6e−06
57	Ca_channel_B	17	75	6.5	0.07
58	DAO	43	404	−61.8	0.00044	DAO	42	403	−70.5	0.0014
	Phytoene_dh	45	406	−331.1	0.19	Phytoene_dh	44	405	−327.7	0.14
	GCV_T	523	863	556.6	1.6e−163	UPF0079	52	149	−47.5	0.26
						GCV_T	522	862	593.6	1.2e−174
59	TPR	285	318	9.6	4.2	TPR	155	189	10.7	3.2
	TPR	327	360	21.6	0.018	TPR	198	231	30.9	3e−05
	TPR	369	402	8.6	5.4	TPR	240	273	31.3	2.2e−05
	TPR	435	468	19.9	0.06	TPR	282	315	15.1	1.1
	TPR	477	510	28.8	0.00013	TPR	324	357	30.1	5.3e−05
	TPR	519	552	23.6	0.0046	TPR	366	399	18.5	0.16
	TPR	561	594	36.5	6.1e−07	TPR	408	441	10.6	3.3
	TPR	603	636	19.7	0.071
60						Myb_DNA-binding	300	345	18.8	0.0011
61	ACBP	5	43	−18.1	0.014	ACBP	42	130	199.3	5.9e−56
	sosui	405	427	—	—	sosui	503	524	—	—
62	Dynein_heavy	494	1192	444.2	1.1e−129
63	myosin_head	147	282	81.9	3.4e−23	myosin_head	1208	1871	946.2	8.7e−281
	MyTH4	561	673	33.9	2.1e−06	IQ	1887	1907	22.5	0.0097
	SH3	1455	1511	8	0.013	IQ	1910	1930	26.1	0.00081
						MyTH4	2088	2195	71.8	1.4e−17
						MyTH4	3071	3185	108.4	1.4e−28
64	WD40	61	98	22.4	0.011	WD40	12	47	13.6	3.4
	WD40	110	145	12.9	4.7	WD40	64	100	19.9	0.059
	WD40	150	187	25.3	0.0014
	WD40	289	324	13.6	3.4
	WD40	341	377	19.9	0.059
65	sosui	152	174	—	—	pkinase	14	265	330.4	2e−95
						sosui	193	215	—	—
66	EGF	12	47	20.7	0.035	EGF	127	162	25.5	0.0013
	EGF	53	86	10.6	2.5	EGF	168	203	43.5	4.6e−09
						EGF	209	245	27.2	0.00038
						EGF	251	286	29.2	9.5e−05
						EGF	292	327	30.5	4e−05
						EGF	338	373	27.2	0.0004
						EGF	379	413	8.6	3.8
						EGF	419	454	29.3	9e−05
						EGF	524	555	21.3	0.022
						EGF	568	598	23.2	0.0061
						EGF	611	641	12.3	1.7
						EGF	645	686	8.2	4.1
						EGF	699	731	2.2	14
						EGF	735	773	4.3	9.2
						EGF	786	817	21.5	0.02
						EGF	830	860	21.4	0.022
						EGF	873	904	7.1	5.1
						EGF	917	947	17.7	0.27
						EGF	960	990	11.4	2.1
						EGF	1003	1033	22.8	0.0079
						EGF	1046	1076	21.6	0.018
						EGF	1089	1119	28.6	0.00015
						EGF	1132	1162	18.1	0.21
						EGF	1175	1205	21.8	0.017
						EGF	1209	1249	0.6	20
						EGF	1262	1292	14.5	1.1
						EGF	1305	1335	15	1
						EGF	1348	1378	22.5	0.0099
						EGF	1391	1421	19	0.12
						EGF	1434	1464	16.5	0.62
						EGF	1477	1507	9.2	3.3
						EGF	1520	1550	14.2	1.2
67	EGF	157	187	10.9	2.3	EGF	127	162	25.5	0.0013
	EGF	200	230	15.6	0.88	EGF	168	203	43.5	4.6e−09
	EGF	243	273	27	0.00045	EGF	209	245	27.2	0.00038
	EGF	286	316	25.5	0.0012	EGF	251	286	29.2	9.5e−05
	EGF	329	359	26.1	0.0008	EGF	292	327	30.5	4e−05
	EGF	372	402	20	0.056	EGF	338	373	27.2	0.0004
	EGF	416	448	11.8	1.9	EGF	379	413	8.6	3.8
	EGF	461	491	23.2	0.006	EGF	419	454	29.3	9e−05
	EGF	504	534	24.9	0.0019	EGF	524	555	21.3	0.022
	EGF	547	577	23.9	0.0038	EGF	568	598	23.2	0.0061
	EGF	590	620	12.6	1.6	EGF	611	641	12.3	1.7
	EGF	633	663	21.6	0.019	EGF	645	686	8.2	4.1
	EGF	676	708	9.9	2.9	EGF	699	731	2.2	14
	EGF	721	751	15.8	0.84	EGF	735	773	4.3	9.2
	EGF	764	794	22	0.014	EGF	786	817	21.5	0.02
	EGF	807	837	18.6	0.15	EGF	830	860	21.4	0.022
	EGF	850	880	18.2	0.19	EGF	873	904	7.1	5.1
	sosui	909	930	—	—	EGF	917	947	17.7	0.27
						EGF	960	990	11.4	2.1
						EGF	1003	1033	22.8	0.0079
						EGF	1046	1076	21.6	0.018
						EGF	1089	1119	28.6	0.00015
						EGF	1132	1162	18.1	0.21
						EGF	1175	1205	21.8	0.017
						EGF	1209	1249	0.6	20
						EGF	1262	1292	14.5	1.1
						EGF	1305	1335	15	1
						EGF	1348	1378	22.5	0.0099
						EGF	1391	1421	19	0.12
						EGF	1434	1464	16.5	0.62
						EGF	1477	1507	9.2	3.3
						EGF	1520	1550	14.2	1.2
68						WH2	399	417	8.5	22
						WH2	463	480	17.2	0.38
69	EGF	45	81	43.6	4.4e−09	EGF	20	56	36.8	4.9e−07
	TB	96	137	42.8	7.5e−09	EGF	62	98	37.7	2.6e−07
	EGF	162	198	26.5	0.00063	EGF	104	138	31.9	1.5e−05
	EGF	204	241	19.5	0.078	TB	153	194	56.9	4.5e−13
						EGF	207	243	39.4	8e−08
						TB	258	299	74.2	2.8e−18
						EGF	325	361	23.4	0.0055
						EGF	367	404	18.3	0.18
						EGF	410	446	35.7	1e−06
						EGF	452	488	24.6	0.0024
						EGF	494	529	27.7	0.00027
						EGF	535	571	28.8	0.00013
						EGF	577	613	17.4	0.33
						EGF	619	654	26.5	0.00061
						Plexin_repeat	625	674	4.9	0.73
						EGF	660	695	31.8	1.5e−05
						EGF	701	737	34.8	2e−06
70	zf-C2H2	44	67	6.3	19	zf-C2H2	21	43	20.9	0.031
	zf-C2H2	101	124	4.3	31	zf-C2H2	50	73	17.6	0.3
	zf-C2H2	202	225	16.1	0.87	zf-C2H2	79	102	15	1.8
	zf-C2H2	247	270	5.6	23
	zf-C2H2	309	332	15.5	1.3
	zf-C2H2	392	415	16.1	0.84
	zf-C2H2	429	452	6.3	20
	zf-C2H2	499	522	1.8	55
	zf-C2H2	578	601	4.5	29
	zf-C2H2	624	646	18.9	0.12
	zf-C2H2	700	722	11	6.7
	zf-C2H2	784	806	19.1	0.1
	zf-C2H2	818	840	20.9	0.031
	zf-C2H2	847	870	17.6	0.3
	zf-C2H2	876	899	15	1.8
	zf-C2H2	984	1007	3.8	35
	zf-C2H2	1013	1036	15.4	1.4
	zf-C2H2	1047	1069	12.8	4.4
	zf-C2H2	1093	1115	20.5	0.039
	zf-C2H2	1121	1144	16.6	0.58
	zf-C2H2	1207	1229	27.4	0.00034

TABLE 5


Complete notation of each functional domain

Abbreviated notation	Complete notation

A_deamin	Adenosine-deaminase (editase) domain
ACBP	Acetyl CoA binding protein
AMP-binding	AMP-binding enzyme
ank	Ank repeat
Armadillo_seg	Armadillo/beta-catenin-like repeat
BNR	BNR repeat
BR01	BR01-like domain
BTB	BTB/POZ domain
C2	C2 domain
Ca_channel_B	Dihydropyridine sensitive L-type calcium channel (Beta subunit)
cadherin	Cadherin domain
chromo	‘chromo’ (CHRomatin Organization MOdifier) domain
COLFI	Fibrillar collagen C-terminal domain
Collagen	Collagen triple helix repeat (20 copies)
DAO	FAD dependent oxidoreductase
DDHD	DDHD domain
death	Death domain
DENN	DENN (AEX-3) domain
Dynein_heavy	Dynein heavy chain
EGF	EGF-like domain
F5_F8_type_C	F5/8 type C domain
FH2	Formin Homology 2 Domain
GCV_T	Glycine cleavage T-protein (aminomethyl transferase)
Glyco_hydro_47	Glycosyl hydrolase family 47
Glyco_transf_29	Glycosyltransferase family 29 (sialyltransferase)
GRAM	GRAM domain
GSPII_E	Bacterial type II secretion system protein
HEAT	HEAT repeat
HR1	Hr1 repeat motif
integrin_B	Integrins, beta chain
IQ	IQ calmodulin-binding motif
Kelch	Kelch motif
KRAB	KRAB box
laminin_EGF	Laminin EGF-like (Domains III and V)
laminin_G	Laminin G domain
laminin_Nterm	Laminin N-terminal (Domain VI)
Myb_DNA-binding	Myb-like DNA-binding domain
myosin_head	Myosin head (motor domain)
MyTH4	MyTH4 domain
Oxysterol_BP	Oxysterol-binding protein
PDZ	PDZ domain (also known as DHR or GLGF)
PH	PH domain
Phytoene_dh	Phytoene dehydrogenase related enzyme
pkinase	Protein kinase domain
Plexin_repeat	Plexin repeat
Pro_dh	Proline dehydrogenase
RasGEF	RasGEF domain
RasGEFN	Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal motif
RhoGAP	RhoGAP domain
RhoGEF	RhoGEF domain
SAM	SAM domain (Sterile alpha motif)
Sema	Sema domain
SH3	SH3 domain
spectrin	Spectrin repeat
TB	TB domain
TIG	IPT/TIG domain
TPR	TPR Domain
Troponin	Troponin
TSPN	Thrombospondin N-terminal-like domain
tubulin	Tubulin/FtsZ family
tubulin-binding	Tau and MAP protein, tubulin-binding repeat
UBA	UBA/TS-N domain
UPF0079	Uncharacterised P-loop hydrolase UPF0079
UQ_con	Ubiquitin-conjugating enzyme
WD40	WD domain, G-beta repeat
WH2	WH2 motif
WW	WW domain
Xlink	Extracellular link domain
zf-BED	BED zinc finger
zf-C2H2	Zinc finger, C2H2 type
ZU5	ZU5 domain

(5) Expression Site

Expression in tissues and in regions of the brain was examined by RT-PCR ELISA. Table 6 lists, for each gene, the tissue and the brain region showing the strongest expression.

(6) Chromosome Position

Using the DNA nucleotide sequences of the clones as queries, the analysis program BLASTN 2.0.14 (the above-mentioned “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs”) was run against the human genome sequences in the library of known sequences, GenBank release 119. The chromosome number from which each matched clone had been derived was extracted from the definition of the matched clone and is listed in Table 6.
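The extraction step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not say how the chromosome number was pulled out of the definitions, so the regular expression, the function name `chromosome_from_definition` and the sample definition strings are hypothetical.

```python
import re
from typing import Optional

# Assumed pattern: the definition of a genomic clone contains a phrase such as
# "chromosome 7" or "chromosome X". The exact wording is an assumption.
_CHROMOSOME = re.compile(r"\bchromosome\s+(\d{1,2}|[XY])(?!\d)", re.IGNORECASE)


def chromosome_from_definition(definition: str) -> Optional[str]:
    """Return the chromosome label found in a definition string, or None."""
    match = _CHROMOSOME.search(definition)
    return match.group(1).upper() if match else None


# Made-up definition strings, for illustration only.
examples = [
    "Homo sapiens chromosome 19 clone EXAMPLE-1, complete sequence",
    "Homo sapiens mRNA for a hypothetical protein",
]
for text in examples:
    print(chromosome_from_definition(text))
```

A definition with no chromosome phrase yields `None`, which would correspond to a "—" entry in the position column of Table 6.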

TABLE 6


Expression site of each gene and chromosome position of homologous gene

SEQ ID NO:	Expression site (tissue)	Expression site (brain)	Chromosome position

1	Brain	Caudate nucleus	—
2	Ovary	Cerebellum	—
3	Brain	—	—
4	Brain	Nucleus of hypothalamus	—
5	Brain	Caudate nucleus	2
6	Kidney	Caudate nucleus	—
7	Ovary	Spinal cord	20
8	Brain	Substantia nigra	7
9	Ovary	Amygdaloid body	—
10	Brain	Caudate nucleus	20
11	Ovary	Thalamus	7
12	Brain	Amygdaloid body	—
13	Ovary	Substantia nigra	9
14	Brain	Caudate nucleus	22
15	Brain	Thalamus	19
16	Kidney	Spinal cord	7
17	Skeletal muscle	Caudate nucleus	19
18	Brain	Spinal cord	7
19	Heart	Amygdaloid body	9
20	—	—	—
21	Brain	Hippocampus	17
22	Brain	Spinal cord	—
23	Testis	Corpus callosum	—
24	Kidney	Thalamus	7
25	—	Nucleus of hypothalamus	22
26	—	—	3
27	Brain	Thalamus	22
28	Brain	Cerebellum	2
29	Brain	Spinal cord	16
30	Brain	Caudate nucleus	8
31	Ovary	Nucleus of hypothalamus	1
32	Heart	Amygdaloid body	18
33	Brain	Caudate nucleus	3
34	Ovary	Caudate nucleus	2
35	Ovary	Thalamus	7
36	Ovary	Spinal cord	11
37	Brain	Cerebellum	14
38	Brain	Cerebellum	—
39	Ovary	Spinal cord	17
40	Ovary	Corpus callosum	16
41	Ovary	Cerebellum	—
42	Kidney	Spinal cord	—
43	Brain	Thalamus	19
44	Brain	Corpus callosum	—
45	Kidney	Cerebellum	X
46	Brain	Spinal cord	22
47	Brain	Amygdaloid body	3
48	Ovary	Spinal cord	17
49	Skeletal muscle	Amygdaloid body	22
50	Skeletal muscle	Thalamus	3
51	Skeletal muscle	Spinal cord	—
52	Ovary	Hippocampus	—
53	Brain	Cerebellum	2
54	—	—	3
55	Brain	Thalamus	—
56	Kidney	Spinal cord	5
57	Brain	Caudate nucleus	11
58	—	—	16
59	Kidney	Spinal cord	3
60	—	—	5
61	Brain	Substantia nigra	10
62	Ovary	Caudate nucleus	11
63	—	—	17
64	Skeletal muscle	Spinal cord	14
65	Brain	Substantia nigra	19
66	Brain	Caudate nucleus	3
67	Brain	Hippocampus	5
68	—	—	16
69	—	—	19
70	Brain	Caudate nucleus	19

From the above information on homology, homologous genes, domains, expression sites, chromosome positions and the like, a person skilled in the art can predict, on the grounds shown in Table 1, that each of the DNAs or genes of the present invention has the corresponding function described in Table 1.

INDUSTRIAL APPLICABILITY

A single nucleotide polymorphism (SNP), which is a difference of one base (nucleotide) among individuals in the DNA or the gene of the present invention, can be found by performing PCR on chromosomal DNA extracted from human blood or tissue, using synthetic DNA primers prepared based on the nucleotide sequence of the DNA or the gene of the present invention or a part thereof, and then determining the nucleotide sequence of the product. Individual constitution or the like can thereby be predicted, which enables the development of a pharmaceutical preparation suited to each individual.
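As a minimal sketch of the comparison step, the Python function below reports positions at which a sequenced PCR product differs from a reference sequence by a single base. It assumes the two sequences have already been aligned to the same length without gaps; the function name and the example sequences are hypothetical and are not taken from the sequence listing.

```python
from typing import List, Tuple


def find_single_base_differences(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are pre-aligned and of equal length.
    Positions where either base is the ambiguity code N are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    pairs = zip(reference.upper(), sample.upper())
    for position, (ref_base, sample_base) in enumerate(pairs, start=1):
        if ref_base != sample_base and "N" not in (ref_base, sample_base):
            differences.append((position, ref_base, sample_base))
    return differences


# Made-up sequences, for illustration only.
print(find_single_base_differences("ACGTACGT", "ACGTTCGT"))
```

A mismatch found in one individual is only a candidate: confirming a polymorphism would also require checking sequencing quality and observing the variant across several individuals.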

Further, once ortholog (homolog, counterpart) genes for the DNA or the gene of the present invention have been isolated from model organisms such as mice, for example by cross hybridization, these genes can be knocked out to produce animal models of human disease, so that the genes that cause human diseases can be searched for and identified.

A DNA chip, a polypeptide chip and an antibody chip can be prepared by arraying the DNAs of the present invention, the polypeptides of the present invention, and antibodies against those polypeptides, respectively. Specifically, novel DNAs or genes obtained by the present invention are assembled on a so-called DNA chip, and probes prepared from blood or tissue of patients with diseases relating to the brain, such as mental disease, or, as a control, from blood or tissue of healthy individuals, are hybridized to the chip, so that the chip can be applied to diagnosis and treatment of such diseases. Moreover, an antibody chip, on which antibodies against the polypeptides of the present invention are comprehensively prepared and arrayed, can be applied to diagnosis, treatment of diseases and the like through proteome analysis, such as detection of a difference in the expression level of a protein between a patient and a healthy individual.
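One common way to express such a patient-versus-control difference is a log2 ratio of the signal measured at each spot on the chip. The sketch below is an assumption for illustration, not a method stated in the application: the function name, the spot labels and the signal values are all hypothetical.

```python
import math
from typing import Dict


def log2_ratios(patient: Dict[str, float], control: Dict[str, float]) -> Dict[str, float]:
    """Return log2(patient / control) for every spot measured in both samples.

    Spots missing from the control, or with a non-positive signal in either
    sample, are skipped because the ratio is undefined for them.
    """
    ratios = {}
    for spot, patient_signal in patient.items():
        control_signal = control.get(spot)
        if control_signal is None or patient_signal <= 0 or control_signal <= 0:
            continue
        ratios[spot] = math.log2(patient_signal / control_signal)
    return ratios


# Made-up signal intensities, for illustration only.
patient_signals = {"spot_a": 8.0, "spot_b": 1.0, "spot_c": 5.0}
control_signals = {"spot_a": 2.0, "spot_b": 4.0}
print(log2_ratios(patient_signals, control_signals))
```

A positive value indicates a higher signal in the patient sample and a negative value a lower one. In practice the signals would be normalized and replicated before any ratio is interpreted.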

The present application claims priority based on the three specifications of Japanese Patent Application Nos. 2000-389742, 2001-95524 and 2001-127066, and incorporates by reference all of the contents disclosed in these specifications.

Claims

1. DNA comprising a nucleotide sequence encoding a polypeptide (a) or (b) as follows:

(a) a polypeptide comprising an amino acid sequence which is identical or substantially identical to an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70;

(b) a polypeptide which comprises an amino acid sequence derived from an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70 by deletion, substitution or addition of a section of amino acid(s), and has biological activity which is substantially the same characteristic with the function of the polypeptide of (a).

2. DNA hybridizing to the DNA of claim 1 under stringent conditions, and encoding a polypeptide having biological activity which is substantially the same characteristic with the function of the polypeptide of (a) of claim 1.

3. A gene construct containing the DNA of claim 1 or 2.

4. A polypeptide (a) or (b) as follows:

(a) a polypeptide comprising an amino acid sequence which is identical or substantially identical to an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70;

(b) a polypeptide comprising an amino acid sequence derived from an amino acid sequence represented by any one of SEQ ID NOS: 1 to 70 by deletion, substitution or addition of a section of amino acids, and having biological activity which is substantially the same characteristic with the function of the polypeptide of (a).

5. A recombinant polypeptide, which is encoded by the gene construct of claim 3.

6. An antibody against the polypeptide of claim 4 or 5.

7. A DNA chip, on which the DNAs of claim 1 or 2 are arrayed.

8. A polypeptide chip, on which the polypeptides of claim 4 or 5 are arrayed.

9. An antibody chip, on which the antibodies of claim 6 are arrayed.
