# Computing InChIs

```{dropdown} About this interactive ![icons](../static/img/rocket.png) recipe
- Author: [Vincent Scalfani](https://orcid.org/0000-0002-7363-531X)
- Reviewer: [Stuart Chalk](https://orcid.org/0000-0002-0703-7776)
- Topics: How to calculate InChIs from SMILES using [RDKit](https://www.rdkit.org/) or [Open Babel](https://openbabel.org)
  *Adapted from CPCDS 2021 Digital IUPAC Session - 51st IUPAC General Assembly*
- Format: Interactive Jupyter Notebook (Python)
- Scenarios: You need to convert a SMILES string into its equivalent InChI string.
- Skills: You should be familiar with
    - [Chemical Identifiers](https://chem.libretexts.org/Courses/University_of_Arkansas_Little_Rock/ChemInformatics_(2015)%3A_Chem_4399_5399/Text/5_Chemical_Identifiers)
- Learning outcomes: After completing this example, you should understand:
    - How to load and use RDKit to obtain and display chemical identifiers
    - How to load and use Open Babel to obtain and display chemical identifiers
- Citation: 'Computing InChIs', Vincent Scalfani, The IUPAC FAIR Chemistry Cookbook, Contributed: 2024-02-14 [https://w3id.org/ifcc/IFCC012](https://w3id.org/ifcc/IFCC012).
- Reuse: This notebook is made available under a [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) license.
```

## 1. Using RDKit
### 1.1 Import RDKit Modules

In [1]:
from rdkit import Chem
from rdkit.Chem import Draw

### 1.2 Create a Molecular Object from SMILES

In [None]:
# PubChem CID: 134601
m = Chem.MolFromSmiles('COC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC(=O)O)N')
m # to show image of molecule

In [None]:
# Internally, we have created an RDKit molecular object
print(m)

### 1.3 Calculate InChI

In [None]:
# Compute InChI from RDKit mol
Chem.MolToInchi(m)
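
An InChI string is built from layers separated by `/`: a version prefix, usually a molecular formula, and then layers that each begin with a one-letter prefix (for example `c` for connectivity, `h` for hydrogens, and `t`, `m`, `s` for tetrahedral stereochemistry). The following is a minimal sketch for inspecting those layers with plain Python. The helper name `split_inchi_layers` is hypothetical, and the sketch assumes the InChI contains a formula layer; it does not validate the string.

```python
def split_inchi_layers(inchi):
    """Split an InChI into (version, formula, [(prefix, content), ...]).

    Hypothetical helper for illustration; assumes a formula layer is present.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string: " + repr(inchi))
    parts = inchi[len("InChI="):].split("/")
    version = parts[0]
    formula = parts[1] if len(parts) > 1 else ""
    # A list of tuples is used because a prefix can occur more than once
    layers = [(part[0], part[1:]) for part in parts[2:] if part]
    return version, formula, layers


# Ethanol's standard InChI, used only as a small worked example
version, formula, layers = split_inchi_layers("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(version)  # 1S
print(formula)  # C2H6O
print(layers)   # [('c', '1-2-3'), ('h', '3H,2H2,1H3')]
```

Passing the output of `Chem.MolToInchi(m)` to this helper should make it easier to see which layers carry the stereochemical information encoded in the SMILES.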

In [None]:
# Compute InChIKey from RDKit mol
Chem.MolToInchiKey(m)
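
Unlike the InChI itself, an InChIKey has a fixed length: 27 characters arranged as 14 uppercase letters, a hyphen, 10 uppercase letters, a hyphen, and a final uppercase letter. A quick format check, sketched below, can catch cases where an InChI or an empty string ended up where an InChIKey was expected. The name `looks_like_inchikey` is hypothetical, and the check only tests the layout; it cannot tell whether the key belongs to a real structure.

```python
import re

# 14 letters, hyphen, 10 letters, hyphen, 1 letter (27 characters in total)
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text):
    """Return True if text has the layout of an InChIKey (format check only)."""
    return INCHIKEY_PATTERN.fullmatch(text) is not None


# Ethanol's standard InChIKey, followed by ethanol's InChI for contrast
print(looks_like_inchikey("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"))        # True
print(looks_like_inchikey("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"))  # False
```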

### 1.4 Calculate InChIs for a List of Molecules

In [None]:
# Import a file of SMILES strings
smiles_list = []
with open('../files/my_smiles.smi') as infile:
    for smi in infile:
        smiles_list.append(smi.rstrip())  # rstrip() removes the trailing newline
print(smiles_list)
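
The loop above keeps every line as-is, so blank lines or a name column after the SMILES would end up in `smiles_list`. The sketch below is one possible, slightly more defensive reader. It assumes that each non-blank line starts with the SMILES and that anything after the first whitespace is a label to ignore; `read_smiles_file` and the demo file are hypothetical and not part of the recipe's data.

```python
import os
import tempfile


def read_smiles_file(path):
    """Return the first whitespace-separated field of each non-blank line."""
    smiles = []
    with open(path) as infile:
        for line in infile:
            fields = line.split()
            if not fields:  # skip blank lines
                continue
            smiles.append(fields[0])
    return smiles


# Demonstration with a throwaway file (contents are illustrative only)
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.smi")
    with open(demo_path, "w") as out:
        out.write("CCO ethanol\n\nc1ccccc1\tbenzene\n")
    print(read_smiles_file(demo_path))  # ['CCO', 'c1ccccc1']
```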

In [None]:
# Or create a list directly
smiles_list = ['COC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC(=O)O)N',
               'COC(=O)[C@@H](CC1=CC=CC=C1)NC(=O)[C@@H](CC(=O)O)N',
               'COC(=O)[C@H](CC1=CC=CC=C1)NC(=O)C[C@@H](C(=O)O)N',
               'C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC=O',
               'C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N',
               'CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)C']

In [None]:
# Next, loop through the smiles_list and create RDKit molecular objects
mols = []
for smi in smiles_list:
    mols.append(Chem.MolFromSmiles(smi))

print(mols)
# Alternative: build the same list with a list comprehension
# mols = [Chem.MolFromSmiles(smi) for smi in smiles_list]

In [None]:
# Display the molecules in a grid
# useSVG=False renders the grid as a PNG image instead of SVG
Draw.MolsToGridImage(mols, molsPerRow=3, useSVG=False)

In [None]:
# Loop through mols (molecular objects) and calculate InChIs
InChIs = [Chem.MolToInchi(mol) for mol in mols]
print(InChIs)
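
`Chem.MolFromSmiles` returns `None` when it cannot parse a SMILES string, and such an entry would make the InChI loop fail. The sketch below shows one way to keep going and record the failures. To keep the example runnable without a toolkit, the parser and the InChI writer are passed in as functions and the demonstration uses stand-ins; `convert_all`, `fake_parse`, and `fake_to_inchi` are hypothetical names, and the placeholder strings they produce are not real InChIs.

```python
def convert_all(smiles_list, parse, to_inchi):
    """Return (results, failures): a SMILES -> InChI dict and unparsable SMILES."""
    results = {}
    failures = []
    for smi in smiles_list:
        mol = parse(smi)
        if mol is None:  # the parser could not build a molecule
            failures.append(smi)
            continue
        results[smi] = to_inchi(mol)
    return results, failures


# Stand-ins so the sketch runs on its own (NOT real chemistry)
def fake_parse(smi):
    return smi if smi.isalpha() else None


def fake_to_inchi(mol):
    return "placeholder-for-" + mol


results, failures = convert_all(["CCO", "C(C"], fake_parse, fake_to_inchi)
print(results)   # {'CCO': 'placeholder-for-CCO'}
print(failures)  # ['C(C']

# With RDKit imported as above, the call would be:
# results, failures = convert_all(smiles_list, Chem.MolFromSmiles, Chem.MolToInchi)
```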

## 2. Using Open Babel
### 2.1 Import Open Babel Modules

In [None]:
# Open Babel v3.1.1
from openbabel import pybel

### 2.2 Create a Molecular Object from SMILES

In [None]:
m = pybel.readstring("smi", "COC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC(=O)O)N")
m # to show image of molecule

In [None]:
# Internally, we have created an Open Babel molecular object
print(type(m))

### 2.3 Calculate InChI

In [None]:
# Set up InChI conversion
conv = pybel.ob.OBConversion()
conv.SetOutFormat("inchi")

In [None]:
# Calculate InChI
inchi_output = conv.WriteString(m.OBMol)
print(inchi_output)

In [None]:
# Set up InChIKey conversion
conv = pybel.ob.OBConversion()
conv.SetOutFormat("inchikey")

In [None]:
# Calculate InChIKey
inchikey_output = conv.WriteString(m.OBMol)
print(inchikey_output)

### 2.4 Calculate InChIs for a List of Molecules

In [None]:
# Import a file of SMILES
smiles_list = []
with open('../files/my_smiles.smi') as infile:
    for smi in infile:
        smiles_list.append(smi.rstrip())  # rstrip() removes the trailing newline
print(smiles_list)

In [None]:
# Next, loop through the smiles_list and create Open Babel molecular objects
ms = [pybel.readstring("smi", smi) for smi in smiles_list]
print(ms)

In [None]:
# Set up InChI conversion
conv = pybel.ob.OBConversion()
conv.SetOutFormat("inchi")

# Loop through ms (Open Babel molecular objects) and calculate InChIs
# rstrip() removes the trailing whitespace that WriteString appends
InChIs = [conv.WriteString(mol.OBMol).rstrip() for mol in ms]
print(InChIs)
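
Because both sections compute InChIs for the same SMILES, it can be useful to check whether the two toolkits agree. The sketch below compares two lists entry by entry. It assumes the RDKit and Open Babel results were stored under separate names (the notebook reuses `InChIs` for both, so one of them would need renaming); `compare_inchis` is a hypothetical helper, and the second demo list contains a deliberately altered string to show a mismatch.

```python
def compare_inchis(smiles_list, inchis_a, inchis_b):
    """Return (smiles, inchi_a, inchi_b) tuples for entries that differ."""
    mismatches = []
    for smi, a, b in zip(smiles_list, inchis_a, inchis_b):
        if a != b:
            mismatches.append((smi, a, b))
    return mismatches


# Illustrative inputs: ethanol and methanol InChIs from "toolkit A",
# and a copy from "toolkit B" with the second entry altered on purpose
smiles_demo = ["CCO", "CO"]
inchis_a = ["InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3", "InChI=1S/CH4O/c1-2/h2H,1H3"]
inchis_b = ["InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3", "ALTERED-FOR-DEMO"]

for smi, a, b in compare_inchis(smiles_demo, inchis_a, inchis_b):
    print(smi, a, b, sep="\n  ")
```

An empty result means the two lists match for every input; any mismatch is worth inspecting by hand rather than assuming either toolkit is wrong.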

**References**

[1] RDKit Documentation: https://www.rdkit.org/docs/index.html

[2] Open Babel Python Documentation: https://open-babel.readthedocs.io/en/latest/UseTheLibrary/Python.html