Property talk:P233
Documentation
Simplified Molecular Input Line Entry Specification (canonical format)
Description | Simplified Molecular Input Line Entry Specification - simplified molecular input line entry specification (Q466769) | |||||||||
---|---|---|---|---|---|---|---|---|---|---|
Represents | simplified molecular input line entry specification (Q466769) | |||||||||
Data type | String | |||||||||
Domain | According to this template:
chemical substance (Q79529)
According to statements in the property:
type of chemical entity (Q113145171), chemical element (Q11344), isotope (Q25276), functional group (Q170409), structural class of chemical entities (Q47154513) or group of isomeric entities (Q15711994) | |||||||||
Allowed values | According to this template:
complex text string
According to statements in the property:
[A-Za-z0-9+\-\*=#$:().>/\\\[\]%]+ | |||||||||
Example | (±)-3-carene (Q19414) → CC2(C)C\1CCC(C)/C=C/12; gold (Q897) → [Au]; ethanol (Q153) → CCO | |||||||||
Formatter URL | https://www.simolecule.com/cdkdepict/depict/bow/svg?smi=$1&zoom=2.0&annotate=cip https://chemapps.stolaf.edu/jmol/jmol.php?model=$1 https://cactus.nci.nih.gov/chemical/structure/$1/file?format=sdf&get3d=true | |||||||||
Robot and gadget jobs | Auto URL, e.g. http://chemapps.stolaf.edu/jmol/jmol.php?model=CCO | |||||||||
Tracking: same | no label (Q32085237) | |||||||||
Tracking: differences | no label (Q20636208) | |||||||||
Tracking: usage | Category:Pages using Wikidata property P233 (Q20636212) | |||||||||
Tracking: local yes, WD no | no label (Q20636205) | |||||||||
See also | isomeric SMILES (P2017), SMARTS notation (P8533) | |||||||||
Proposal discussion | not applicable | |||||||||
List of violations of this constraint: Database reports/Constraint violations/P233#Type Q113145171, Q11344, Q25276, Q170409, Q47154513, Q15711994
Format “[A-Za-z0-9+\-\*=#$:().>/\\\[\]%]+”: value must be formatted using this pattern (PCRE syntax).
List of violations of this constraint: Database reports/Constraint violations/P233#Entity types
List of violations of this constraint: Database reports/Constraint violations/P233#Scope
List of violations of this constraint: Database reports/Constraint violations/P233#single best value
Values matching the pattern ^(.*@.*)$ are automatically replaced with \1 and moved to the isomeric SMILES (P2017) property.
format constraint
Added a format constraint that just checks whether only valid characters are used. --Akkakk 00:28, 26 June 2013 (UTC)
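A minimal sketch of what such a character check amounts to, using the pattern listed in the documentation above. The function name is hypothetical, and matching against the whole value (rather than a substring) is an assumption about how the constraint is applied. As the comment above says, this checks characters only and does not verify that the string is a chemically valid SMILES.

```python
import re

# Pattern copied from the property's format constraint.
# Note that "@" is not among the allowed characters.
SMILES_CHARS = re.compile(r"[A-Za-z0-9+\-\*=#$:().>/\\\[\]%]+")


def has_only_valid_characters(value: str) -> bool:
    """Return True if the whole string consists of allowed characters.

    Assumption: the constraint is applied to the entire value,
    so fullmatch is used instead of search.
    """
    return SMILES_CHARS.fullmatch(value) is not None


if __name__ == "__main__":
    # Example values taken from the property documentation.
    for smiles in ["CCO", "[Au]", r"CC2(C)C\1CCC(C)/C=C/12"]:
        print(smiles, has_only_valid_characters(smiles))
```

A value containing a space or "@" would be rejected by this check, which is consistent with "@"-bearing values being handled by isomeric SMILES (P2017).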
Multiple SMILES
Hi. It seems (in the WPEN chemboxes) that a few chemical components can have multiple SMILES codes. Do we have a solution in WD to represent all of them? (I see only one property, "SMILES".) Kelson (talk) 14:28, 24 April 2015 (UTC)
- @Kelson: You can have only one SMILES per chemical compound, but depending on your chemical, you can have two notations: canonical and isomeric. Canonical is the normal representation without indications about chiral carbons. This representation is not unique. The isomeric representation is unique. So normally you use canonical for simple molecules and isomeric for chiral molecules. If you have more than two, there is an error, or the article is about a mixture or mixes different data in the same article.
- I opened a discussion in the Chemistry project. See here if you want to add your comment. Snipre (talk) 13:57, 17 June 2015 (UTC)
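As a rough illustration of the canonical/isomeric split discussed in this thread, the sketch below applies the pattern ^(.*@.*)$ mentioned in the property documentation to decide which property a value would go to. The function name is hypothetical, and this is only a sketch of that single rule: it looks for "@" and does not detect other stereo notation such as "/" or "\" bonds.

```python
import re

# Pattern quoted in the property documentation: any value containing "@".
ISOMERIC_HINT = re.compile(r"^(.*@.*)$")


def target_property(smiles: str) -> str:
    """Return the property ID a SMILES value would be filed under.

    Assumption: only the "@" rule from the documentation is applied;
    the value itself is left unchanged.
    """
    return "P2017" if ISOMERIC_HINT.match(smiles) else "P233"


if __name__ == "__main__":
    print(target_property("CCO"))              # no chiral marker
    print(target_property("C[C@H](N)C(=O)O"))  # contains "@"
```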
Canonicalization method is ambiguous
The current definition of the property does not specify which canonicalization method should be used. As written on the SMILES Wikipedia page, canonicalization output depends on the algorithm/software; thus, without properly specifying it, there is a risk of several alternative canonicalization methods being used for values, reducing the overall quality of data for this property. Ungurinis (talk) 06:55, 8 October 2021 (UTC)
I wonder if this is the same canonical SMILES as described in Towards a Universal SMILES representation - A standard method to generate canonical SMILES based on the InChI (Q28133319). Pinging Snipre, who created the property according to the property's history. Ungurinis (talk) 10:53, 10 January 2022 (UTC)
URL format
The formatter URL (P1630) value added for this property is used for every SMILES statement throughout Wikidata. While SMILES → structural formula is quite useful for Wikidata purposes (e.g. verification of stereochemistry, an easy way to get a formula and add proper chemical classes), a 3D molecule depiction is useless; it may serve only a decorative function. I oppose any changes to the default URL format without prior discussion (with notification of WikiProject Chemistry members). Wostr (talk) 22:51, 29 September 2023 (UTC)
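For reference, a minimal sketch of how a formatter URL turns a SMILES value into a link: the $1 placeholder is replaced by the value. The URL is the structural-formula formatter listed in the documentation above; the function name is hypothetical, and percent-encoding the value is an assumption made here so that characters such as "#" or "[" survive in a query string.

```python
from urllib.parse import quote

# Structural-formula formatter URL from the property documentation.
FORMATTER_URL = (
    "https://www.simolecule.com/cdkdepict/depict/bow/svg"
    "?smi=$1&zoom=2.0&annotate=cip"
)


def depiction_url(smiles: str, formatter: str = FORMATTER_URL) -> str:
    """Substitute a percent-encoded SMILES value for the $1 placeholder."""
    return formatter.replace("$1", quote(smiles, safe=""))


if __name__ == "__main__":
    print(depiction_url("CCO"))
    print(depiction_url("[Au]"))
```

Passing a different formatter string as the second argument shows what the links would look like under an alternative default.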
- All Properties
- Properties with string-datatype
- Properties used on 1000000+ items
- Properties with constraints on type
- Properties with format constraints
- Properties with qualifiers constraints
- Properties with entity type constraints
- Properties with scope constraints
- Properties with single best value constraints
- Chemical properties
- Medical properties