[Zinc-fans] [Dock-fans] more valences per nitrogen than they should be?
Noel O'Boyle
baoilleach at gmail.com
Fri Sep 14 02:23:36 PDT 2007
Hello all,
I came across this problem some time ago (see email below from Jul
18). I've placed the list of molecules at
http://www.redbrick.dcu.ie/~noel/dodgyNs.txt.
I am an OpenBabel developer, and if you need any help using OpenBabel
to sort out problems in ZINC, I'd only be too happy to volunteer some
time. We have some nice canonicalisation code donated by eMolecules,
etc., etc.
Noel
==== original email =====
Dear John,
I mentioned in passing some problems with some of the structures in
the MOL2 files. I've now followed this up. The main problem seems to
be that there are molecules containing nitrogen atoms of type N.3
(this is specified in the file) but which have four bonds.
This must be an error (right?). Either these are of type N.4 (and have
a positive charge), or they are of type N.3 and have at most three
bonds. I think that it is the latter, and that there has been some
mistake by a structure generation program at an earlier stage in your
pipeline.
I looked for all examples of this in ZINC and have attached the
result. There are 97211 molecules with atoms of type N.3 but with 4
bonds. That's almost 5% (97211 out of 2021041, about 4.8%).
Regards,
Noel
For the record, here's the Python code to create the attached file:
import glob
import pybel
import openbabel as ob

outputfile = open("dodgyNs.txt", "w")
for filename in glob.glob("gzipfiles/*.mol2"):
    for mol in pybel.readfile("mol2", filename):
        for atom in mol:
            if atom.type == "N3": # Internal OB atom type (equivalent to N.3)
                numbonds = len([1 for x in ob.OBAtomBondIter(atom.OBAtom)])
                if numbonds == 4:
                    print >> outputfile, mol.title
                    break
outputfile.close()
==== end of original email =====
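For anyone who wants to run the same check without OpenBabel installed, here is a minimal sketch in plain Python that reads the ATOM and BOND sections of a single MOL2 record directly. It assumes the standard Tripos field order (atom type in the sixth ATOM column, the two atom ids in the second and third BOND columns). The function name and the five-atom example record are made up for illustration and do not come from ZINC; the example mirrors the case of a nitrogen bound to two carbons and two hydrogens.

```python
from collections import Counter


def n3_atoms_with_four_bonds(mol2_text):
    """Return the ids of N.3 atoms that have exactly four bonds.

    Hypothetical helper: handles one MOL2 record given as a string.
    """
    section = None
    atom_types = {}          # atom id -> SYBYL atom type (e.g. "N.3")
    bond_counts = Counter()  # atom id -> number of bonds listed for it
    for line in mol2_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("@<TRIPOS>"):
            # Remember which record section we are in (MOLECULE, ATOM, BOND, ...)
            section = line[len("@<TRIPOS>"):]
            continue
        fields = line.split()
        if section == "ATOM" and len(fields) >= 6:
            atom_types[int(fields[0])] = fields[5]
        elif section == "BOND" and len(fields) >= 4:
            # Each bond counts once for each of its two atoms
            bond_counts[int(fields[1])] += 1
            bond_counts[int(fields[2])] += 1
    return [atom_id for atom_id, atom_type in atom_types.items()
            if atom_type == "N.3" and bond_counts[atom_id] == 4]


# Made-up record: a nitrogen typed N.3 but bonded to two carbons and two hydrogens
EXAMPLE = """@<TRIPOS>MOLECULE
hypothetical_example
 5 4 1 0 0
SMALL
NO_CHARGES

@<TRIPOS>ATOM
 1 N1  0.0  0.0 0.0 N.3 1 LIG
 2 C1  1.4  0.0 0.0 C.3 1 LIG
 3 C2 -1.4  0.0 0.0 C.3 1 LIG
 4 H1  0.0  1.0 0.0 H   1 LIG
 5 H2  0.0 -1.0 0.0 H   1 LIG
@<TRIPOS>BOND
 1 1 2 1
 2 1 3 1
 3 1 4 1
 4 1 5 1
"""

print(n3_atoms_with_four_bonds(EXAMPLE))
```

Unlike the pybel script, this looks at the atom type string exactly as written in the file, so it reports what the MOL2 file claims rather than what OpenBabel perceives after reading it. A multi-molecule file would first need to be split on the @<TRIPOS>MOLECULE lines.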
On 14/09/2007, John J. Irwin <jji at cgl.ucsf.edu> wrote:
> Hi -
>
> Thanks for your email. We know there are some "broken" molecules in
> ZINC. If you send me the list of molecules you found (to jji at
> cgl.ucsf.edu) I will see what I can do to put it right.
>
> John
>
>
> Scott Brozell wrote:
> > Hi,
> >
> > I am cc-ing Zinc-fans
> > http://blur.compbio.ucsf.edu/mailman/listinfo/zinc-fans
> >
> > Scott
> >
> > On Tue, 11 Sep 2007 burgosgu at ualberta.ca wrote:
> >
> >
> >> I'm just about to run a virtual screening and I just noticed that
> >> most of the compounds I downloaded from the ZINC database have 4
> >> valences for nitrogen instead of 3, e.g. a nitrogen is bound to two
> >> carbons and two hydrogens. Is this correct? I'm afraid that this
> >> will affect my dockings.
> >>
> > _______________________________________________
> > Dock-fans mailing list
> > Dock-fans at docking.org
> > http://blur.compbio.ucsf.edu/mailman/listinfo/dock-fans
> >
> _______________________________________________
> Zinc-fans mailing list
> Zinc-fans at docking.org
> http://blur.compbio.ucsf.edu/mailman/listinfo/zinc-fans
>