[Zinc-fans] Zinc-fans Digest, Vol 37, Issue 1

John J. Irwin jji at cgl.ucsf.edu
Wed Jul 30 09:53:41 PDT 2008


Hi Jianping

jianping zhou wrote:
> Hi,
>
> I am new here. I can download molecules in SMILES or SDF format, but
> if I need more information such as compound name, IUPAC name,
> formula, structure, etc., what would be the best way to get this
> information?
>
The structure is included in SDF format, and is implicit in the SMILES. 
The chemical formula is a simple digest of the SMILES. OpenBabel might 
do this, but you can write one yourself in a few lines of perl or 
python. IUPAC names can be generated from SMILES by Ogham (OpenEye, 
eyesopen.com) and other programs. Most compounds in ZINC do not have 
names other than their IUPAC names, and it is not our remit to keep 
track of the few that do. Please try ChemDB, 
http://cdb.ics.uci.edu/CHEM/Web/, and of course PubChem.
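
As an illustration of the "few lines of python" idea, here is a minimal
sketch that tallies a formula from the atom block of a V2000 molfile (the
record type inside an SDF file). It assumes hydrogens are explicit in the
record; for SMILES, where hydrogens are implicit, a toolkit such as
OpenBabel is the safer route. The function name, the METHANOL record and
its coordinates are made up for the example, and formal charges are ignored.

```python
from collections import Counter


def formula_from_molblock(molblock: str) -> str:
    """Return a Hill-order formula for one V2000 molfile record.

    Assumes explicit hydrogens and ignores formal charges.
    """
    lines = molblock.splitlines()
    # Line 4 of a molfile is the counts line; the first 3 columns hold
    # the number of atoms.
    n_atoms = int(lines[3][0:3])
    # Each atom line reads "x y z symbol ...", so the symbol is field 4.
    symbols = [line.split()[3] for line in lines[4:4 + n_atoms]]
    tally = Counter(symbols)
    # Hill order: C first, then H, then the rest alphabetically.
    # Without carbon, everything is alphabetical.
    if "C" in tally:
        rest = sorted(e for e in tally if e not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in tally else []) + rest
    else:
        order = sorted(tally)
    return "".join(e + (str(tally[e]) if tally[e] > 1 else "") for e in order)


# Hypothetical methanol record with placeholder coordinates.
METHANOL = "\n".join([
    "methanol",
    "  example           3D",
    "",
    "  6  5  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.4000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.4000    1.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.4000   -0.5000    0.9000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.4000   -0.5000   -0.9000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.7000    0.9000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "  1  3  1  0  0  0  0",
    "  1  4  1  0  0  0  0",
    "  1  5  1  0  0  0  0",
    "  2  6  1  0  0  0  0",
    "M  END",
])

if __name__ == "__main__":
    print(formula_from_molblock(METHANOL))
```

To process a whole SDF file, split it on the "$$$$" record separator and
call the function on each record.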

John
UCSF ZINC Team
> Thanks,
>
> -jian
>
> --- On *Mon, 6/16/08, zinc-fans-request at docking.org 
> /<zinc-fans-request at docking.org>/* wrote:
>
>     From: zinc-fans-request at docking.org <zinc-fans-request at docking.org>
>     Subject: Zinc-fans Digest, Vol 37, Issue 1
>     To: zinc-fans at docking.org
>     Date: Monday, June 16, 2008, 7:52 AM
>
>     Send Zinc-fans mailing list submissions to
>     	zinc-fans at docking.org
>
>     To subscribe or unsubscribe via the World Wide Web, visit
>
>     	http://blur.compbio.ucsf.edu/mailman/listinfo/zinc-fans
>     or, via email, send a message with subject or body 'help' to
>     	zinc-fans-request at docking.org
>
>     You can reach the person managing the list at
>     	zinc-fans-owner at docking.org
>
>     When replying, please edit your Subject line so it is more specific
>     than "Re: Contents of Zinc-fans digest..."
>
>
>     Today's Topics:
>
>        1. Re: total number of compounds in vendors subset -reg
>           (John J. Irwin)
>        2. Re: All-purchasable subset (John J. Irwin)
>        3. non-unique ZINC ids in Zinc7 (Jens Auer)
>
>
>     ----------------------------------------------------------------------
>
>     Message: 1
>     Date: Wed, 28 May 2008 07:30:06 -0700
>     From: "John J. Irwin" <jji at cgl.ucsf.edu>
>     Subject: Re: [Zinc-fans] total number of compounds in vendors subset
>     	-reg
>     To: rafi A <rafi4dd at gmail.com>
>     Cc: zinc-fans at docking.org
>     Message-ID:
>      <483D6C6E.1070101 at cgl.ucsf.edu>
>     Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>     Hi Rafi
>
>     Thanks for your email and your interest in ZINC. Sorry to take so long 
>     to get back to you.
>
>     I have recently exported a fresh copy of Sigma Aldrich in ZINC 8 
>     (http://zinc8.docking.org). There are 17,931 molecules in the source 
>     catalogs, and 15,186 in ZINC. We downloaded every SDF file we could find 
>     on the Sigma Aldrich website. I've ordered the CD, and will include any 
>     additional molecules that may be there.
>
>     Previously we have included the "rare" library from Sigma Aldrich, based
>     on files we received perhaps 5 years ago. There were nearly 200K of 
>     these. Since these are no longer available on the Sigma Aldrich website, 
>     they have been removed from ZINC. I think this change may account for 
>     some of the discrepancies you saw.
>
>     Good luck
>
>     John
>     UCSF ZINC Team
>
>
>
>
>     rafi A wrote:
>     > Hello,
>     >  
>     >
>     > Where can we find the total number of compounds in a subset?
>     >
>     >  
>     >
>     > For example I want to download the vendors/sigma Aldrich subset.
>     >
>     >  
>     >
>     > In the table column, catalog information: Source entries; shows 295,562.
>     >
>     > Another column, ZINC information: Loaded; shows 115,595. So I expected 
>     > the total number of molecules to be either 295,000 or 115,000.
>     >
>     >  
>     >
>     > But when I downloaded the mid pH (SMILES or mol2), it shows only 
>     > 14,449 molecules.
>     >
>     >  
>     >
>     > Did I misunderstand something? Or can you tell me where I can find the 
>     > total number of molecules in a subset before downloading?
>     >
>     >  
>     >
>     > Thanks in advance.
>     >
>     >  
>     >
>     > Best regards,
>     >
>     > Rafi
>     >
>
>
>     ------------------------------
>
>     Message: 2
>     Date: Wed, 28 May 2008 08:04:29 -0700
>     From: "John J. Irwin" <jji at cgl.ucsf.edu>
>     Subject: Re: [Zinc-fans] All-purchasable subset
>     To: "Josmar R. da Rocha" <bije_br at yahoo.com.br>
>     Cc: zinc-fans at docking.org
>     Message-ID: <483D747D.6000107 at cgl.ucsf.edu>
>     Content-Type: text/plain; charset=UTF-8; format=flowed
>
>     Hi Josmar
>
>     I have exported "all purchasable" for ZINC 8, which now has 8.4M 
>     molecules. It should have nearly 10M in a month or so after I get a few 
>     more problems sorted out.
>
>     May I take this opportunity to point out that we have created several 
>     fun new subsets in ZINC that we have found useful, and you may too!
>
>     #17 - neutral fragments.  (51K of these)
>     #29 - CNS permeable (209K of these)
>     #33 - goldilocks - not too big, not too small, not too polar, not too 
>     greasy - just right. (almost 500K of these)
>     #50 - stiff-solubles - fairly rigid fragments that are probably quite 
>     soluble.
>
>
>     Happy docking!
>
>     John
>     UCSF ZINC Team
>
>
>     Josmar R. da Rocha wrote:
>     > Dear Zinc-fans,
>     >
>     > I noticed that the subset "all-purchasable" that could be downloaded
>     > from Zinc 7 is no longer available in Zinc 8. I'd like to know if the 
>     > only way to get this subset would be by downloading each one of the 
>     > files found in "By vendor /in stock" subsets or is there any other way?
>     >
>     > Thanks in advance!
>     >
>     > Josmar Rocha
>     >
>     >
>     > _______________________________________________
>     > Zinc-fans mailing list
>     > Zinc-fans at docking.org
>     > http://blur.compbio.ucsf.edu/mailman/listinfo/zinc-fans
>     >   
>
>
>     ------------------------------
>
>     Message: 3
>     Date: Mon, 11 Feb 2008 11:49:57 +0100
>     From: Jens Auer <auer at bit.uni-bonn.de>
>     Subject: [Zinc-fans] non-unique ZINC ids in Zinc7
>     To: zinc-fans at docking.org
>     Message-ID: <1202726997.9493.14.camel at lsi-08>
>     Content-Type: text/plain; charset="utf-8"
>
>     Hi,
>
>     we have just found several molecules in the Zinc7 database which are
>     different in structure but have the same ZINC id. I've compiled a list
>     of ids where you find different compounds (acc. to a unique SMILES
>     string computed with MOE) under each id. As an example, I've also
>     attached two compounds with id ZINC00000308 where you can see that they
>     really differ in structure.
>
>     Best regards,
>       Jens
>      
>     -- 
>     Jens Auer
>     Life Science Informatics
>     B-IT Intl. Center for Information Technology
>     Rheinische Friedrich-Wilhelms-University Bonn
>     Dahlmannstraße 2
>     D-53113 Bonn
>     phone: +49 (228) 2699 314
>     email: auer at bit.uni-bonn.de
>     -------------- next part --------------
>     A non-text attachment was scrubbed...
>     Name: doppelte_ids.txt.bz2
>     Type: application/x-bzip
>     Size: 39034 bytes
>     Desc: not available
>     Url :
>     http://blur.compbio.ucsf.edu/pipermail/zinc-fans/attachments/20080211/fec82c40/attachment.bin
>
>     -------------- next part --------------
>     ZINC00000308
>       MOE2007           3D
>
>      55 58  0  0  1  0  0  0  0  0999 V2000
>         7.0050   11.3100   -1.5430 C   0  0  0  0  0  0  0  0  0  0  0  0
>         5.8640   11.9320   -0.7350 C   0  0  0  0  0  0  0  0  0  0  0  0
>         4.5060   12.0740   -2.7340 C   0  0  0  0  0  0  0  0  0  0  0  0
>         4.3760   13.5990   -2.7240 C   0  0  0  0  0  0  0  0  0  0  0  0
>         4.3480   10.1300   -1.3000 C   0  0  0  0  0  0  0  0  0  0  0  0
>         4.1840    9.6930    0.1580 C   0  0  3  0  0  0  0  0  0  0  0  0
>         5.0500   10.0170    0.7340 H   0  0  0  0  0  0  0  0  0  0  0  0
>         4.0700    8.1920    0.2220 C   0  0  0  0  0  0  0  0  0  0  0  0
>         2.8500    7.5840    0.2470 C   0  0  0  0  0  0  0  0  0  0  0  0
>         2.7690    6.1840    0.3070 C   0  0  0  0  0  0  0  0  0  0  0  0
>         3.8520    5.4310    0.3440 N   0  0  0  0  0  0  0  0  0  0  0  0
>         5.0730    5.9720    0.3160 C   0  0  0  0  0  0  0  0  0  0  0  0
>         5.2210    7.3810    0.2560 C   0  0  0  0  0  0  0  0  0  0  0  0
>         6.5080    7.9410    0.2330 C   0  0  0  0  0  0  0  0  0  0  0  0
>         7.5990    7.1270    0.2680 C   0  0  0  0  0  0  0  0  0  0  0  0
>         7.4550    5.7420    0.3270 C   0  0  0  0  0  0  0  0  0  0  0  0
>         6.2230    5.1640    0.3510 C   0  0  0  0  0  0  0  0  0  0  0  0
>         1.4360    5.5360    0.3340 C   0  0  0  0  0  0  0  0  0  0  0  0
>         0.2760    6.3280    0.3010 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -0.9570    5.7590    0.3250 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -1.0840    4.3590    0.3840 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.3490    3.7470    0.4090 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.4400    2.3900    0.4610 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -1.2960    1.5940    0.5000 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -0.0550    2.1520    0.4770 C   0  0  0  0  0  0  0  0  0  0  0  0
>         0.0790    3.5500    0.4180 C   0  0  0  0  0  0  0  0  0  0  0  0
>         1.3440    4.1580    0.3930 C   0  0  0  0  0  0  0  0  0  0  0  0
>         3.0020   10.2840    0.7020 O   0  0  0  0  0  0  0  0  0  0  0  0
>         7.9500   11.4750   -1.0250 H   0  0  0  0  0  0  0  0  0  0  0  0
>         6.8330   10.2390   -1.6500 H   0  0  0  0  0  0  0  0  0  0  0  0
>         7.0460   11.7720   -2.5290 H   0  0  0  0  0  0  0  0  0  0  0  0
>         5.9770   13.0160   -0.7230 H   0  0  0  0  0  0  0  0  0  0  0  0
>         5.8930   11.5520    0.2860 H   0  0  0  0  0  0  0  0  0  0  0  0
>         5.4110   11.7910   -3.2700 H   0  0  0  0  0  0  0  0  0  0  0  0
>         3.6380   11.6390   -3.2300 H   0  0  0  0  0  0  0  0  0  0  0  0
>         3.4840   13.8840   -2.1650 H   0  0  0  0  0  0  0  0  0  0  0  0
>         5.2560   14.0360   -2.2520 H   0  0  0  0  0  0  0  0  0  0  0  0
>         4.2950   13.9640   -3.7480 H   0  0  0  0  0  0  0  0  0  0  0  0
>         5.1980    9.6110   -1.7410 H   0  0  0  0  0  0  0  0  0  0  0  0
>         3.4440    9.8860   -1.8560 H   0  0  0  0  0  0  0  0  0  0  0  0
>         1.9480    8.1780    0.2220 H   0  0  0  0  0  0  0  0  0  0  0  0
>         6.6330    9.0130    0.1880 H   0  0  0  0  0  0  0  0  0  0  0  0
>         8.5880    7.5590    0.2490 H   0  0  0  0  0  0  0  0  0  0  0  0
>         8.3360    5.1170    0.3530 H   0  0  0  0  0  0  0  0  0  0  0  0
>         6.1300    4.0890    0.3970 H   0  0  0  0  0  0  0  0  0  0  0  0
>         0.3660    7.4030    0.2560 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -1.8400    6.3810    0.2980 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -3.2440    4.3500    0.3820 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -3.4130    1.9220    0.4760 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -1.3960    0.5200    0.5450 H   0  0  0  0  0  0  0  0  0  0  0  0
>         0.8230    1.5230    0.5050 H   0  0  0  0  0  0  0  0  0  0  0  0
>         2.2390    3.5530    0.4190 H   0  0  0  0  0  0  0  0  0  0  0  0
>         2.1900   10.0400    0.2370 H   0  0  0  0  0  0  0  0  0  0  0  0
>         4.5580   11.5930   -1.3380 N   0  3  0  0  0  0  0  0  0  0  0  0
>         3.8140   12.0510   -0.7990 H   0  0  0  0  0  0  0  0  0  0  0  0
>       1  2  1  0  0  0  0
>       1 29  1  0  0  0  0
>       1 30  1  0  0  0  0
>       1 31  1  0  0  0  0
>       2 32  1  0  0  0  0
>       2 33  1  0  0  0  0
>       2 54  1  0  0  0  0
>       3  4  1  0  0  0  0
>       3 34  1  0  0  0  0
>       3 35  1  0  0  0  0
>       3 54  1  0  0  0  0
>       4 36  1  0  0  0  0
>       4 37  1  0  0  0  0
>       4 38  1  0  0  0  0
>       5  6  1  0  0  0  0
>       5 39  1  0  0  0  0
>       5 40  1  0  0  0  0
>       5 54  1  0  0  0  0
>       6  7  1  0  0  0  0
>       6  8  1  0  0  0  0
>       6 28  1  0  0  0  0
>       8  9  2  0  0  0  0
>       8 13  1  0  0  0  0
>       9 10  1  0  0  0  0
>       9 41  1  0  0  0  0
>      10 11  2  0  0  0  0
>      10 18  1  0  0  0  0
>      11 12  1  0  0  0  0
>      12 13  1  0  0  0  0
>      12 17  2  0  0  0  0
>      13 14  2  0  0  0  0
>      14 15  1  0  0  0  0
>      14 42  1  0  0  0  0
>      15 16  2  0  0  0  0
>      15 43  1  0  0  0  0
>      16 17  1  0  0  0  0
>      16 44  1  0  0  0  0
>      17 45  1  0  0  0  0
>      18 19  1  0  0  0  0
>      18 27  2  0  0  0  0
>      19 20  2  0  0  0  0
>      19 46  1  0  0  0  0
>      20 21  1  0  0  0  0
>      20 47  1  0  0  0  0
>      21 22  1  0  0  0  0
>      21 26  2  0  0  0  0
>      22 23  2  0  0  0  0
>      22 48  1  0  0  0  0
>      23 24  1  0  0  0  0
>      23 49  1  0  0  0  0
>      24 25  2  0  0  0  0
>      24 50  1  0  0  0  0
>      25 26  1  0  0  0  0
>      25 51  1  0  0  0  0
>      26 27  1  0  0  0  0
>      27 52  1  0  0  0  0
>      28 53  1  0  0  0  0
>      54 55  1  0  0  0  0
>     M  CHG  1  54   1
>     M  END
>     >  <name>
>     ZINC00000308
>
>     $$$$
>     ZINC00000308
>       MOE2007           3D
>
>      41 42  0  0  0  0  0  0  0  0999 V2000
>         2.4960    4.2870    0.4940 C   0  0  0  0  0  0  0  0  0  0  0  0
>         1.1980    3.5220    0.4960 C   0  0  0  0  0  0  0  0  0  0  0  0
>         0.0830    4.1400    0.7170 N   0  0  0  0  0  0  0  0  0  0  0  0
>         0.0820    5.3400    0.9250 O   0  0  0  0  0  0  0  0  0  0  0  0
>        -1.1120    5.9210    1.1480 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.1250    5.2510    1.1370 O   0  0  0  0  0  0  0  0  0  0  0  0
>        -1.1820    7.2460    1.3840 N   0  0  0  0  0  0  0  0  0  0  0  0
>         1.1990    2.0680    0.2440 C   0  0  0  0  0  0  0  0  0  0  0  0
>         2.4020    1.4000    0.0050 C   0  0  0  0  0  0  0  0  0  0  0  0
>         2.4000    0.0410   -0.2300 C   0  0  0  0  0  0  0  0  0  0  0  0
>         1.2060   -0.6650   -0.2290 C   0  0  0  0  0  0  0  0  0  0  0  0
>         0.0020   -0.0050    0.0090 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -0.0040    1.3570    0.2460 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -1.1660   -0.7000    0.0090 O   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.3720    0.0570    0.1350 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.7000    0.2830    1.6250 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -4.2450    0.1820    1.6920 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -4.5580   -0.9530    0.6860 C   0  0  0  0  0  0  0  0  0  0  0  0
>        -3.5520   -0.7280   -0.4620 C   0  0  0  0  0  0  0  0  0  0  0  0
>         1.2100   -2.0040   -0.4610 O   0  0  0  0  0  0  0  0  0  0  0  0
>         2.4790   -2.6170   -0.6980 C   0  0  0  0  0  0  0  0  0  0  0  0
>         2.7050    4.6460   -0.5140 H   0  0  0  0  0  0  0  0  0  0  0  0
>         2.4180    5.1360    1.1730 H   0  0  0  0  0  0  0  0  0  0  0  0
>         3.3040    3.6330    0.8220 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -0.3730    7.7810    1.3930 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.0420    7.6650    1.5450 H   0  0  0  0  0  0  0  0  0  0  0  0
>         3.3330    1.9470    0.0040 H   0  0  0  0  0  0  0  0  0  0  0  0
>         3.3310   -0.4740   -0.4150 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -0.9360    1.8710    0.4260 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.2650    1.0160   -0.3730 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.2400   -0.4910    2.2390 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -2.3680    1.2710    1.9440 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -4.7070    1.1170    1.3760 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -4.5740   -0.0900    2.6950 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -5.5810   -0.8670    0.3190 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -4.3980   -1.9280    1.1470 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -4.0190   -0.1510   -1.2600 H   0  0  0  0  0  0  0  0  0  0  0  0
>        -3.2060   -1.6870   -0.8490 H   0  0  0  0  0  0  0  0  0  0  0  0
>         2.3400   -3.6850   -0.8680 H   0  0  0  0  0  0  0  0  0  0  0  0
>         2.9420   -2.1670   -1.5760 H   0  0  0  0  0  0  0  0  0  0  0  0
>         3.1220   -2.4680    0.1690 H   0  0  0  0  0  0  0  0  0  0  0  0
>       1  2  1  0  0  0  0
>       1 22  1  0  0  0  0
>       1 23  1  0  0  0  0
>       1 24  1  0  0  0  0
>       2  3  2  0  0  0  0
>       2  8  1  0  0  0  0
>       3  4  1  0  0  0  0
>       4  5  1  0  0  0  0
>       5  6  2  0  0  0  0
>       5  7  1  0  0  0  0
>       7 25  1  0  0  0  0
>       7 26  1  0  0  0  0
>       8  9  1  0  0  0  0
>       8 13  2  0  0  0  0
>       9 10  2  0  0  0  0
>       9 27  1  0  0  0  0
>      10 11  1  0  0  0  0
>      10 28  1  0  0  0  0
>      11 12  2  0  0  0  0
>      11 20  1  0  0  0  0
>      12 13  1  0  0  0  0
>      12 14  1  0  0  0  0
>      13 29  1  0  0  0  0
>      14 15  1  0  0  0  0
>      15 16  1  0  0  0  0
>      15 19  1  0  0  0  0
>      15 30  1  0  0  0  0
>      16 17  1  0  0  0  0
>      16 31  1  0  0  0  0
>      16 32  1  0  0  0  0
>      17 18  1  0  0  0  0
>      17 33  1  0  0  0  0
>      17 34  1  0  0  0  0
>      18 19  1  0  0  0  0
>      18 35  1  0  0  0  0
>      18 36  1  0  0  0  0
>      19 37  1  0  0  0  0
>      19 38  1  0  0  0  0
>      20 21  1  0  0  0  0
>      21 39  1  0  0  0  0
>      21 40  1  0  0  0  0
>      21 41  1  0  0  0  0
>     M  END
>     >  <name>
>     ZINC00000308
>
>     $$$$
>
>     ------------------------------
>
>     _______________________________________________
>     Zinc-fans mailing list
>     Zinc-fans at docking.org
>     http://blur.compbio.ucsf.edu/mailman/listinfo/zinc-fans
>
>
>     End of Zinc-fans Digest, Vol 37, Issue 1
>     ****************************************
>

