CurlySMILES has been designed as a chemical language to formulate
molecular structures and material architectures independently
of natural language and chemical terminology. Yet, it is
sometimes desirable to tie a CurlySMILES notation together
with a common name or a specific term. The annotation dictionary keys
cha, chc, and chn make this possible by associating an abbreviation or
acronym, a code, and a name, respectively, with a notation.
The following example uses chn
to supplement the encoding of (E)-2-methylbut-2-enoic acid
with one of its other names, tiglic acid (derived from the Latin
word tiglium for croton):
CC=C{E}C(C)C(=O)O{__chn=tiglic_acid}
tiglic acid for (E)-2-methylbut-2-enoic acid
There are other trivial names for this compound, such as tiglinic acid
and cevadic acid. Which name to include in a CurlySMILES notation
(if any at all) depends on context. If the notation annotates or links
to a text in which a non-systematic name is used, including that name
in the CurlySMILES notation makes the notation easier to find in later
queries.
An acronym, such as DMSO for dimethyl sulfoxide, is included via
cha:
CS(=O)C{__cha=DMSO}
DMSO for dimethyl sulfoxide
A special code is included via chc,
illustrated for SCYX-7158 (Scynexis/DNDi/Anacor), a compound
evaluated in a drug-discovery pipeline[2]:
c1cc(F)cc(C(F)(F)F)c1C(=O)Nc2ccc3 \
C(C)(C)OB(O)c3c2{__chc=SCYX-7158}
SCYX-7158, a potential drug to treat sleeping sickness [2]
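The three examples above follow one pattern: a name annotation is appended to the end of the notation. The Python sketch below illustrates that pattern only. The function name with_names is hypothetical, the "__" marker is copied from the examples above, and the replacement of spaces by underscores is an assumption inferred from tiglic_acid; none of this is taken from an official CurlySMILES tool.

```python
def with_names(notation, cha=None, chc=None, chn=None):
    """Append cha/chc/chn name entries to a notation (illustrative helper)."""
    entries = []
    for key, value in (("cha", cha), ("chc", chc), ("chn", chn)):
        if value is not None:
            # Assumption: spaces are written as underscores, as in tiglic_acid.
            entries.append(f"{key}={value.replace(' ', '_')}")
    if not entries:
        return notation
    # Assumption: "__" is the marker seen in the examples above;
    # several key/value pairs are joined with ';'.
    return notation + "{__" + ";".join(entries) + "}"


dmso = with_names("CS(=O)C", cha="DMSO")
dmso_full = with_names("CS(=O)C", cha="DMSO", chn="dimethyl sulfoxide")
```

Here dmso reproduces the DMSO notation shown above, while dmso_full combines an acronym and a name in a single annotation using the semicolon-separated key/value form.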
References
[1] A. Drefahl: CurlySMILES: a chemical language to customize and annotate encodings of molecular and nanodevice structures. J. Cheminf. 2011, 3:1; doi: 10.1186/1758-2946-3-1.
[2] A. A. Rowe: Body Borers. Chem. & Eng. News 2011, 89 (33), 32-35.
Format of an annotation:
{AMk1=v1;k2=v2;...;kn=vn}
where AM is an annotation marker, and ki=vi is a key/value pair.
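For later querying, the name entries can be read back out of a notation by splitting each annotation according to the format above. The following Python sketch is one way to do that. It assumes that the annotation marker is the leading run of non-alphanumeric characters in the annotation body, that annotations are not nested, and that values contain no ';'. The function names parse_annotation and name_annotations are hypothetical and not part of any CurlySMILES software.

```python
import re

# Matches the body of each non-nested {...} group in a notation.
ANNOTATION = re.compile(r"\{([^{}]*)\}")


def parse_annotation(body):
    """Split a body such as '__chn=tiglic_acid' into (marker, pairs)."""
    # Assumption: the marker is the leading non-alphanumeric run.
    marker = re.match(r"[^A-Za-z0-9]*", body).group(0)
    pairs = {}
    for item in body[len(marker):].split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            pairs[key] = value
    return marker, pairs


def name_annotations(notation):
    """Collect cha, chc, and chn entries from all annotations."""
    found = {}
    for body in ANNOTATION.findall(notation):
        _, pairs = parse_annotation(body)
        for key in ("cha", "chc", "chn"):
            if key in pairs:
                found[key] = pairs[key]
    return found


names = name_annotations("CS(=O)C{__cha=DMSO}")
```

Annotations without key/value pairs, such as {E} in the tiglic acid example, contribute nothing to the result because they contain no '=' sign.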