CurlySMILES notations can have a recursive format: a notation
may have annotations containing a dictionary key that requires one
or more CurlySMILES notations as content. For example, the keys
slv, slt, cos and cot, which provide solvent, solute, cosolvent and
cosolute information, expect a list of notations that encode solvent,
solute, cosolvent and cosolute structure, respectively. Such a list is
represented in the conjunct CurlySMILES notation (ConjCN) format.
Three formats are possible: a list of combinatorial options, a list of
alternate selections and a list of implied composition, encoding
“and/or”-, “or”- and “and”-connected SMILES or CurlySMILES notations,
respectively.
|
List of combinatorial options.
This is a list of comma-connected
SMILES or CurlySMILES notations.
The list represents chemical systems including compounds
individually and in any possible combination. For example,
OCCCCC,C(O)CCCC,CCC(O)CC
represents 1-, 2-, and 3-pentanol as well as the binary
and ternary combinations thereof.
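The expansion of a combinatorial list into the systems it represents can be sketched in a few lines of Python. This is only an illustration, not part of any CurlySMILES tool: the function name `expand_combinatorial_options` is hypothetical, and the sketch assumes that no commas occur inside the individual notations (true for plain SMILES; annotated members would need a brace-aware splitter).

```python
from itertools import combinations


def expand_combinatorial_options(conjcn: str) -> list[tuple[str, ...]]:
    """Enumerate every non-empty combination of a comma-connected list.

    Hypothetical helper; assumes the members themselves contain no commas.
    """
    members = [m.strip() for m in conjcn.split(",") if m.strip()]
    systems: list[tuple[str, ...]] = []
    # Sizes 1..n give the individual compounds, then binary, ternary, ...
    for size in range(1, len(members) + 1):
        systems.extend(combinations(members, size))
    return systems


systems = expand_combinatorial_options("OCCCCC,C(O)CCCC,CCC(O)CC")
for system in systems:
    print(" + ".join(system))
```

For the three pentanols this yields three single compounds, three binary systems and one ternary system.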
|
List of alternate selections.
This is a list of SMILES or CurlySMILES notations connected by
vertical bar characters.
For example,
OCCCCC|C(O)CCCC|CCC(O)CC
represents the pool containing 1-, 2-, and 3-pentanol. Within
an annotation context, this reads as “either with
1-pentanol or with 2-pentanol or with 3-pentanol.”
|
List of implied composition.
This is a list of SMILES or CurlySMILES notations connected by
dots. In fact, it is a single
notation (already defined in the original SMILES language)
consisting of subnotations for chemical species. No formal bond
connects the two atoms denoted on either side of a dot, but the
species may nevertheless interact at the
molecular level. For example,
OCCCCC.C(O)CCCC.CCC(O)CC
represents the ternary system 1-pentanol + 2-pentanol
+ 3-pentanol.
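The three list formats can be told apart by their connecting character. The following Python sketch shows one way to do that. The function names are hypothetical, and two assumptions go beyond the text: delimiters inside curly-brace annotations are ignored (because annotations may themselves contain a ConjCN), and when several delimiters occur at the top level, comma is checked before vertical bar, and vertical bar before dot.

```python
def split_top_level(text: str, delimiter: str) -> list[str]:
    """Split text at delimiter characters that lie outside curly braces."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == delimiter and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def classify_conjcn(text: str) -> tuple[str, list[str]]:
    """Return the list format and its members (precedence is an assumption)."""
    formats = (
        (",", "combinatorial options"),  # "and/or"
        ("|", "alternate selections"),   # "or"
        (".", "implied composition"),    # "and"
    )
    for delimiter, label in formats:
        parts = split_top_level(text, delimiter)
        if len(parts) > 1:
            return label, parts
    return "single notation", [text]


print(classify_conjcn("OCCCCC,C(O)CCCC,CCC(O)CC"))
print(classify_conjcn("OCCCCC|C(O)CCCC|CCC(O)CC"))
print(classify_conjcn("OCCCCC.C(O)CCCC.CCC(O)CC"))
```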
|
Context targeting example: The ConjCN
format allows compact, contextual targeting of topics, concepts
and data snippets in the chemical literature. To address
the inorganic salt KBr dissolved in aqueous
solutions containing either cationic tetradecyltrimethylammonium bromide
(TTABr) or anionic sodium dodecyl sulfate (SDS), as studied,
for example, by Zhou and Hao (doi: 10.1021/je100905g),
the following notation applies:
[K+].[Br-]{aqcos=\
CCCCCCCCCCCCCC[N+](C)(C)(C).[Br-]|[Na+].CCCCCCCCCCCCOS(=O)(=O)[O-]}
The SMILES notation of KBr is annotated with marker aq to indicate
that KBr occurs in aqueous solution. The two cosolvents of interest
(in this case, surfactants) are formally supplied as ConjCN
(pair of alternate surfactants) via key cos.
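To make the structure of this notation explicit, the Python sketch below takes it apart into the annotated SMILES, the marker, the key and the alternate cosolvents. It is a minimal illustration under stated assumptions, not a general CurlySMILES parser: the function name `parse_single_annotation` is hypothetical, the notation is joined onto one line (assuming the trailing backslash in the text only marks a display line break), and the input is assumed to carry exactly one trailing annotation with a two-character marker followed by a single key=value entry.

```python
notation = (
    "[K+].[Br-]{aqcos="
    "CCCCCCCCCCCCCC[N+](C)(C)(C).[Br-]|[Na+].CCCCCCCCCCCCOS(=O)(=O)[O-]}"
)


def parse_single_annotation(text: str) -> tuple[str, str, str, list[str]]:
    """Split 'SMILES{MMkey=value}' into base, marker, key and alternates.

    Assumes one trailing annotation, a two-character marker and one
    key=value entry without nested braces (illustrative only).
    """
    start = text.index("{")
    end = text.rindex("}")
    base = text[:start]            # annotated SMILES, here KBr
    body = text[start + 1:end]     # content between the braces
    marker, entry = body[:2], body[2:]
    key, _, value = entry.partition("=")  # split at the first '=' only
    alternates = value.split("|")         # "or"-connected notations
    return base, marker, key, alternates


base, marker, key, alternates = parse_single_annotation(notation)
print(base, marker, key)
for alt in alternates:
    # Each alternate is itself a dot-connected ("and") ion pair.
    print(alt.split("."))
```

Each alternate is a dot-connected ion pair, so the vertical bar separates the two surfactants while the dots separate the ions within each surfactant.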
|