Good evening to anyone who can and would be willing to help me!
My task is the following:
Starting from PDB structure files of "zinc finger" proteins, find the structures in which the zinc coordination geometry deviates significantly
from that of a perfect tetrahedron.
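To make "deviates significantly" concrete, here is a minimal sketch of one possible measure, not necessarily the right one: the RMS deviation of the ligand-zinc-ligand angles from the ideal tetrahedral angle, arccos(-1/3), about 109.47 degrees. The function name `rms_deviation_deg` is my own invention, and I have not settled on any cutoff value.

```python
import math

# Ideal angle at the centre of a regular tetrahedron: arccos(-1/3), in degrees.
IDEAL_TETRAHEDRAL_DEG = math.degrees(math.acos(-1.0 / 3.0))


def rms_deviation_deg(angles_deg):
    """Root-mean-square deviation (in degrees) of the given angles
    from the ideal tetrahedral angle.

    Hypothetical helper: the choice of RMS as the distortion measure
    is an assumption, other measures are possible.
    """
    if not angles_deg:
        raise ValueError("need at least one angle")
    squared = [(a - IDEAL_TETRAHEDRAL_DEG) ** 2 for a in angles_deg]
    return math.sqrt(sum(squared) / len(squared))
```

For a zinc with four ligands there are six such angles, so `angles_deg` would hold six values; structures could then be ranked by this number.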
As a first step, I downloaded the PDB identifiers of the zinc fingers I am interested in into the file PS00028.txt (see attached).
Next, I retrieved the PDB files that correspond to these identifiers. The identifiers are first extracted from the PROSITE records:
Code:
f2 = open("PS00028.txt", "r")
prosite_lines = f2.readlines()
f2.close()

# Get pdb id from prosite records
prosite_id = []
for line in prosite_lines:
    fields = line.split()
    # Skip blank lines; only the "3D" records list PDB entries
    if fields and fields[0] == "3D":
        for ps_id in fields[1:]:
            # Keep the 4-character code, dropping the trailing ";"
            prosite_id.append(ps_id[:4])

# Output the pdb id
fichier = open("pdb_prosite.txt", "w")
for pdbid in prosite_id:
    fichier.write(pdbid + "\n")
fichier.close()
And this is where it gets difficult ^^ Starting from the scripts below, I need to get the values of the angles between the zinc and the 4 ligands (2 histidines, 2 cysteines) that interact with it. The download script first:
Code:
#! /usr/bin/env python
import urllib, os, sys, getopt

# Read the PDB ids written by the previous script
f4 = open("pdb_prosite.txt", "r")
prosite_lines = f4.readlines()
f4.close()
prosite_id = []
for line in prosite_lines:
    fields = line.split()
    for ps_id in fields[0:]:
        prosite_id.append(ps_id[:4])

def usage():
    print "\nusage: pdb_get [options] <code> "
    print " where [options] could be:"
    print " -p to retrieve PDB format (default)"
    print " -c to retrieve mmCIF format"
    print " -s to retrieve structure factors along with the PDB format coordinates"
    print " and <code> is the 4-character PDB entry code"

def get_options():
    pdb = 1
    mmCIF = 0
    struct_fact = 0
    try:
        opts, args = getopt.getopt(sys.argv[1:], 'hpcs')
    except getopt.GetoptError:
        print 'Unrecognized Option: ', sys.argv[1:]
        usage()
        # Exit here: main() expects four return values
        sys.exit(2)
    for o, a in opts:
        if o == '-h':
            usage()
            sys.exit(0)
        elif o == '-p':
            pdb = 1
        elif o == '-c':
            pdb = 0
            mmCIF = 1
        elif o == '-s':
            struct_fact = 1
    return pdb, mmCIF, struct_fact, args

def main():
    (pdb, mmCIF, struct_fact, args) = get_options()
    for pdbid in prosite_id:
        pdbid = pdbid.lower()
        if pdb == 1:
            print "\nDownloading %s.pdb.gz .........." % (pdbid),
            url = 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb%s.ent.gz' % pdbid
            filename = pdbid + '.pdb.gz'
            try:
                urllib.urlretrieve(url, filename)
                print "Uncompressing %s.pdb.gz" % pdbid
                os.system("gunzip %s.pdb.gz" % pdbid)
            except Exception:
                print "Error retrieving %s" % url
        elif mmCIF == 1:
            # The original used an undefined variable "code" here; pdbid is meant
            print "\nDownloading %s.cif.gz .........." % (pdbid),
            url = 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/mmCIF/%s.cif.gz' % pdbid
            filename = pdbid + '.cif.gz'
            try:
                urllib.urlretrieve(url, filename)
                print "Uncompressing %s.cif.gz" % pdbid
                os.system("gunzip %s.cif.gz" % pdbid)
            except Exception:
                print "Error retrieving %s" % url
        if struct_fact == 1:
            print "\nDownloading r%ssf.ent.gz .........." % (pdbid),
            url = 'ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/structure_factors/r%ssf.ent.gz' % pdbid
            filename = 'r' + pdbid + 'sf.ent.gz'
            try:
                urllib.urlretrieve(url, filename)
                print "Uncompressing r%ssf.ent.gz" % pdbid
                os.system("gunzip r%ssf.ent.gz" % pdbid)
            except Exception:
                print "Error retrieving %s" % url

if __name__ == '__main__':
    main()
I imagine this can be done with the Biopython tools:
Code:
# This module tests the pdb lib of BioPython
# Goal : detect zinc coordinating residues and do some calculations on them
from Bio.PDB import PDBParser, PPBuilder, NeighborSearch
import sys

p = PDBParser(PERMISSIVE=1)
ppb = PPBuilder()
# pdbid = sys.argv[1]
pdbid = '1g25'
fname = "pdb_data/%s.pdb" % pdbid
structure = p.get_structure(pdbid, fname)
print "Number of models : %d " % len(structure)
model = structure[0]
chain = model['A']

# Get the sequence (if the chain has several peptides, the last one is kept)
for pp in ppb.build_peptides(chain):
    seq = pp.get_sequence()

# Check for the presence of a ZN atom
# Build a list of non-hydrogen atoms
azn_list = []
hat_list = []
for residue in chain:
    for atom in residue:
        if atom.name == "ZN":
            azn_list.append(atom)
        elif atom.name[0] != "H":
            hat_list.append(atom)
print " Number of Zinc atom found : %d " % len(azn_list)
print " Number of heavy atoms found : %d " % len(hat_list)

# Get Zinc coordinating residues
site_list = []
ns = NeighborSearch(hat_list)
rd = 2.5
for zn_atom in azn_list:
    center = zn_atom.get_coord()
    res_coord = ns.search(center, rd, 'R')
    site = []
    for res in res_coord:
        print " %s %d " % (res.resname, res.id[1])
        site.append(res.id[1])
    site.sort()
    # The linker slices need exactly four coordinating residues
    if len(site) == 4:
        print "linker sequences : %s %s " % (seq[site[0]:site[1]-1], seq[site[2]:site[3]-1])
    site_list.append(site)
    print "\n"

# Mark the coordinating positions of each site under the sequence
# (assumes residue numbering starts at 1 without gaps)
znseq_list = []
site_num = 1
for site in site_list:
    znseq = ''
    i = 0
    while i < len(seq):
        car = ' '
        for res_num in site:
            if i == (res_num - 1):
                car = chr(site_num + 48)
        znseq = znseq + car
        i += 1
    znseq_list.append(znseq)
    site_num += 1
print seq
for znseq in znseq_list:
    print znseq
The trouble is that I cannot work out how to fit this way of computing angles into the Biopython script. Does anyone have any ideas?
Code:
>>> vector1 = atom1.get_vector()
>>> vector2 = atom2.get_vector()
>>> vector3 = atom3.get_vector()
>>> angle = calc_angle(vector1, vector2, vector3)
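To show what I am trying to end up with, here is a minimal sketch that works on plain (x, y, z) coordinates instead of Biopython objects. The names `angle_deg` and `coordination_angles` are hypothetical helpers of mine, and I am assuming that each zinc has exactly four ligand atoms whose coordinates I already know.

```python
import math
from itertools import combinations


def angle_deg(a, center, b):
    """Angle a-center-b in degrees, each point given as (x, y, z)."""
    u = [a[i] - center[i] for i in range(3)]
    v = [b[i] - center[i] for i in range(3)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("a ligand coincides with the central atom")
    # Clamp to [-1, 1] to guard against rounding errors before acos
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def coordination_angles(zinc, ligands):
    """All ligand-zinc-ligand angles; four ligands give six angles."""
    return [angle_deg(a, zinc, b) for a, b in combinations(ligands, 2)]


# Check on a perfect tetrahedron centred on the origin
tetra = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
angles = coordination_angles((0, 0, 0), tetra)
```

In the Biopython script, I suppose the coordinates would come from `get_coord()` on the zinc atom and on the donor atoms (presumably the Cys SG and the His NE2 or ND1), which would mean searching at atom level, `ns.search(center, rd, 'A')`, rather than at residue level. As far as I understand, `calc_angle` on the three `get_vector()` results gives the same angle, but in radians.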
Thaaaaaaaanks!
-----