Summary
We describe an approach to protein structure comparison designed to detect distantly related proteins of similar fold. The procedure must be flexible enough to accommodate the elasticity of protein folds without losing specificity. Protein structures are represented as a series of secondary structure elements; for each element, a local environment describes its relations with the surrounding elements. Secondary structure elements are then aligned by comparing their features and local environments. The procedure is illustrated with searches of a database of 468 protein structures to identify proteins of similar topology to porcine pepsin, porphobilinogen deaminase and serum amyloid P-component. In all cases the searches correctly identify protein structures with folds similar to those of the search proteins. Multiple cross-comparisons of protein structures allow proteins of similar fold to be clustered, as exemplified by a clustering of α/β- and β-class protein structures. We discuss applications of the comparison and clustering of three-dimensional protein structures to comparative modelling and structure-based protein design.
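The idea of aligning secondary structure elements by their features and local environments can be illustrated with a minimal sketch. It is not the authors' implementation: the `SSE` class, the choice of sorted distances to neighbouring elements as the "local environment", the equal weights in `similarity`, the gap penalty, and the use of a Needleman-Wunsch style dynamic programming alignment are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SSE:
    """One secondary structure element (hypothetical representation)."""
    kind: str                # "H" for helix, "E" for strand
    length: int              # number of residues in the element
    env: Tuple[float, ...]   # assumed environment: sorted distances (Å) to neighbours


def similarity(a: SSE, b: SSE) -> float:
    """Illustrative score in [-1, 1]; the 0.5/0.5 weights are assumptions."""
    if a.kind != b.kind:
        return -1.0
    length_term = min(a.length, b.length) / max(a.length, b.length)
    n = min(len(a.env), len(b.env))
    if n == 0:
        env_term = 0.0
    else:
        mean_diff = sum(abs(x - y) for x, y in zip(a.env, b.env)) / n
        env_term = 1.0 / (1.0 + mean_diff)
    return 0.5 * length_term + 0.5 * env_term


def align(p: List[SSE], q: List[SSE], gap: float = -0.5):
    """Global alignment of two SSE series by dynamic programming."""
    n, m = len(p), len(q)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + similarity(p[i - 1], q[j - 1]),
                score[i - 1][j] + gap,   # element of p left unmatched
                score[i][j - 1] + gap,   # element of q left unmatched
            )
    # Trace back from the corner to recover the matched element pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + similarity(p[i - 1], q[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Toy example: q is p with one extra helix inserted after the first strand.
p = [SSE("E", 5, (4.8, 10.0)), SSE("H", 12, (9.5,)), SSE("E", 6, (4.8,))]
q = [p[0], SSE("H", 8, (12.0,)), p[1], p[2]]
total, pairs = align(p, q)
print(total, pairs)
```

In this toy case the alignment skips the inserted helix and pairs the three shared elements. Alignment scores of this kind, computed for every pair of structures in a database, could in principle be converted to distances and passed to a standard clustering procedure, which is the role the cross-comparisons play in the summary above.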
Cite this article
Rufino, S.D., Blundell, T.L. Structure-based identification and clustering of protein families and superfamilies. J Computer-Aided Mol Des 8, 5–27 (1994). https://doi.org/10.1007/BF00124346
DOI: https://doi.org/10.1007/BF00124346