[Zinc-fans] [Dock-fans] more valences per nitrogen than they should be?

John J. Irwin jji at cgl.ucsf.edu
Fri Sep 14 12:19:32 PDT 2007


Noel

Thank you for your kind offer of assistance. I would like to move
towards some kind of community-based system whereby helpful and
enterprising people like you could "fix" problems in ZINC, even without
contacting me (wikizincia?). Unfortunately, we don't have a mechanism
for that yet. For now, I gratefully accept notification of problems, and
will redouble my effort to fix errors.

John


Noel O'Boyle wrote:
> Hello all,
>
> I came across this problem some time ago (see email below from Jul
> 18). I've placed the list of molecules at
> http://www.redbrick.dcu.ie/~noel/dodgyNs.txt.
>
> I am an OpenBabel developer, and if you need any help using OpenBabel
> to sort out problems in ZINC, I'd be only too happy to volunteer some
> time. We have some nice canonicalisation code donated by eMolecules,
> etc., etc.
>
> Noel
>
> ==== original email =====
> Dear John,
>
> I mentioned in passing some problems with some of the structures in
> the MOL2 files. I've now followed this up. The main problem seems to
> be that there are molecules containing nitrogen atoms of type N.3
> (this is specified in the file) but which have four bonds.
>
> This must be an error (right?). Either these are of type N.4 (and have
> a positive charge), or they are of type N.3 and have at most three
> bonds. I think that it is the latter, and that there has been some
> mistake by a structure generation program at an earlier stage in your
> pipeline.
>
> I looked for all examples of this in ZINC and have attached the
> result. There are 97211 molecules with atoms of type N.3 but with 4
> bonds. That's almost 5% (out of 2021041).
>
> Regards,
>   Noel
>
> For the record, here's the Python code to create the attached file:
>
> import glob
>
> import pybel
> import openbabel as ob
>
> # Record the title of every molecule that has an N.3 atom with four bonds.
> outputfile = open("dodgyNs.txt", "w")
> for filename in glob.glob("gzipfiles/*.mol2"):
>     for mol in pybel.readfile("mol2", filename):
>         for atom in mol:
>             if atom.type == "N3":  # Internal OB atom type (equivalent to N.3)
>                 # Count the bonds attached to this atom.
>                 numbonds = sum(1 for bond in ob.OBAtomBondIter(atom.OBAtom))
>                 if numbonds == 4:
>                     outputfile.write(mol.title + "\n")
>                     break  # one hit is enough; go on to the next molecule
> outputfile.close()
>
> ==== end of original email =====
>
> On 14/09/2007, John J. Irwin <jji at cgl.ucsf.edu> wrote:
>   
>> Hi -
>>
>> Thanks for your email. We know there are some "broken" molecules in
>> ZINC. If you send me the list of molecules you found (to jji at
>> cgl.ucsf.edu) I will see what I can do to put it right.
>>
>> John
>>
>>
>> Scott Brozell wrote:
>>     
>>> Hi,
>>>
>>> I am cc-ing Zinc-fans
>>> http://blur.compbio.ucsf.edu/mailman/listinfo/zinc-fans
>>>
>>> Scott
>>>
>>> On Tue, 11 Sep 2007 burgosgu at ualberta.ca wrote:
>>>
>>>
>>>       
>>>> I'm just about to run a virtual screening and I just noticed that
>>>> most of the compounds I downloaded from the ZINC database have 4
>>>> valences for nitrogen instead of 3, e.g. a nitrogen is bound to two
>>>> carbons and two hydrogens. Is this correct? I'm afraid that this
>>>> will affect my dockings.
>>>>

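For anyone who wants to run the same kind of check without Open Babel, here is a minimal sketch in plain Python. It assumes a single-molecule MOL2 record in the usual Tripos layout (atom type in the sixth column of the ATOM section, atom ids in the second and third columns of the BOND section). The function name overbonded_n3_atoms and the five-atom sample record are hypothetical, made up for illustration, and are not taken from ZINC or from the script quoted above.

```python
from collections import Counter

def overbonded_n3_atoms(mol2_text, max_bonds=3):
    """Return (atom_id, atom_name, bond_count) for N.3 atoms with too many bonds.

    Assumes mol2_text holds one molecule; a multi-molecule file would first
    have to be split on its @<TRIPOS>MOLECULE records.
    """
    section = None
    atom_info = {}            # atom_id -> (atom_name, sybyl_type)
    bond_counts = Counter()   # atom_id -> number of bonds touching it
    for raw in mol2_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("@<TRIPOS>"):
            section = line[len("@<TRIPOS>"):]
            continue
        fields = line.split()
        if section == "ATOM" and len(fields) >= 6:
            # atom_id atom_name x y z atom_type ...
            atom_info[int(fields[0])] = (fields[1], fields[5])
        elif section == "BOND" and len(fields) >= 4:
            # bond_id origin_atom_id target_atom_id bond_type
            bond_counts[int(fields[1])] += 1
            bond_counts[int(fields[2])] += 1
    return [
        (atom_id, name, bond_counts[atom_id])
        for atom_id, (name, sybyl_type) in sorted(atom_info.items())
        if sybyl_type == "N.3" and bond_counts[atom_id] > max_bonds
    ]

# Hypothetical fragment: a nitrogen typed N.3 but bound to two carbons and
# two hydrogens (hydrogens on the carbons are left out to keep it short).
SAMPLE = """@<TRIPOS>MOLECULE
sample_fragment
5 4 0 0 0
SMALL
NO_CHARGES

@<TRIPOS>ATOM
1 N1  0.000  0.000  0.000 N.3 1 LIG 0.0
2 C1  1.450  0.000  0.000 C.3 1 LIG 0.0
3 C2 -0.480  1.370  0.000 C.3 1 LIG 0.0
4 H1 -0.340 -0.480  0.830 H   1 LIG 0.0
5 H2 -0.340 -0.480 -0.830 H   1 LIG 0.0
@<TRIPOS>BOND
1 1 2 1
2 1 3 1
3 1 4 1
4 1 5 1
"""

print(overbonded_n3_atoms(SAMPLE))  # [(1, 'N1', 4)]
```

The sketch only flags the mismatch between atom type and bond count. Whether such an atom should become N.4 with a positive charge or lose a hydrogen is a separate decision.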