# Accessing PubChem through PUG-REST: Part II

```{dropdown} About this interactive ![icons](../static/img/rocket.png) recipe
- Author(s): [Sunghwan Kim](https://orcid.org/0000-0001-9828-2074)
- Reviewer: [Samuel Munday](https://orcid.org/0000-0001-5404-6934)
- Topic(s): How to programmatically retrieve chemical data and chemical names using a structure.
- Format: Interactive Jupyter Notebook (Python)
- Scenario: You need to access and potentially download chemical data.
- Skills: You should be familiar with:
    - [Application Programming Interfaces (APIs)](https://www.ibm.com/topics/api)
    - [Introductory Python](https://www.youtube.com/watch?v=kqtD5dpn9C8)
    - [Identifiers for Chemical Substances](https://chem.libretexts.org/Courses/University_of_Arkansas_Little_Rock/ChemInformatics_(2015)%3A_Chem_4399_5399/Text/5_Chemical_Identifiers)
- Learning outcomes:
    - How to access chemical data using a chemical name
    - How to get chemical names for a chemical structure
- Citation: 'Accessing PubChem through PUG-REST - Part II', Sunghwan Kim, The IUPAC FAIR Chemistry Cookbook, Contributed: 2023-02-28 [https://w3id.org/ifcc/IFCC007](https://w3id.org/ifcc/IFCC007).
- Reuse: This notebook is made available under a [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) license.
```

In [None]:
import requests
import time
from IPython.display import Image, display

## 1. Accessing PubChem data using a chemical name

You can access PubChem data using a chemical name.

In [None]:
print("CID             :", requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/aspirin/cids/TXT").text.strip())
print("MolecularFormula:", requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/aspirin/property/MolecularFormula/TXT").text.strip())
print("MolecularWeight :", requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/aspirin/property/MolecularWeight/TXT").text.strip())
print("IsomericSMILES  :", requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/aspirin/property/IsomericSMILES/TXT").text.strip())
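
The four requests above follow one URL pattern, and PUG-REST also accepts a comma-separated list of properties in a single request. The sketch below only builds such a URL and does not send it. The helper name `build_property_url` and the constant `PUG_REST` are illustrative choices, not part of PubChem or of this recipe, and the name is percent-encoded on the assumption that it may contain spaces.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"  # base URL used throughout this recipe


def build_property_url(name, properties, output="CSV"):
    """Return a PUG-REST URL that requests several properties for one chemical name."""
    encoded_name = quote(name, safe="")    # percent-encode spaces and other reserved characters
    property_list = ",".join(properties)   # PUG-REST takes properties as a comma-separated list
    return f"{PUG_REST}/compound/name/{encoded_name}/property/{property_list}/{output}"


url = build_property_url("aspirin", ["MolecularFormula", "MolecularWeight", "IsomericSMILES"])
print(url)
```

Passing the resulting URL to `requests.get` would replace three of the calls above with one, at the cost of having to parse a CSV table instead of a single text value.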

It is important to understand that, in practice, a single chemical name is often used to refer to several different, though closely related, chemicals.  Drug names are a common case, as the following example shows.

In [None]:
cids1 = requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/aleve/cids/TXT").text.strip().split()
print(cids1)
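
Splitting the response text on whitespace works when the request succeeds, but if the name is not found the body contains an error message rather than CIDs. A minimal sketch of a stricter parser follows. The function name `parse_cids` is hypothetical, and it assumes only that a successful TXT response lists one numeric CID per line.

```python
def parse_cids(response_text):
    """Return the CIDs from a TXT response, one per line, or raise if a line is not numeric."""
    lines = [line.strip() for line in response_text.splitlines() if line.strip()]
    for line in lines:
        if not line.isdigit():
            # Anything other than digits suggests an error message, not a CID list.
            raise ValueError(f"unexpected line in CID response: {line!r}")
    return lines


# The three CIDs discussed in this section, in an arbitrary order.
print(parse_cids("156391\n23681059\n6925666\n"))
```

When working with `requests`, checking `response.status_code` (or calling `response.raise_for_status()`) before parsing is a complementary safeguard.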

In [None]:
for mycid in cids1:
    display(Image(requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/" + mycid + "/record/PNG?image_size=200x200").content))
    print("CID " + mycid, ":", requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/" + mycid + "/property/MolecularFormula/TXT").text)
    time.sleep(0.2)
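
The `time.sleep(0.2)` call keeps the loop from sending requests too quickly; PubChem asks users to stay at or below five requests per second. Because each iteration above sends two requests, a fixed pause per iteration is only an approximation. The sketch below spaces individual calls instead. The class name `Throttle` and its parameters are illustrative, and the clock and sleep functions are injectable only so that the behavior can be checked without waiting.

```python
import time


class Throttle:
    """Space successive calls to wait() at least `min_interval` seconds apart."""

    def __init__(self, min_interval=0.2, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call; None before the first call

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)  # pause only for the time still owed
                now = self._clock()
        self._last = now


throttle = Throttle(min_interval=0.2)
throttle.wait()  # returns immediately on the first call
throttle.wait()  # pauses until 0.2 s have passed since the first call
```

Calling `throttle.wait()` before each `requests.get` would pace every request, whichever loop it appears in.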

"**Aleve**" is the name of a drug product whose active ingredient is **naproxen sodium** (CID 23681059).  When dissolved in water, naproxen sodium dissociates into a **naproxen anion** (CID 6925666), which interacts with the drug target, and a sodium cation.  Note that the **naproxen anion** is the conjugate base of **naproxen** (CID 156391).  Although the three CIDs have different structures, all of them are frequently called "Aleve".

It is also possible to retrieve every PubChem compound record whose name contains the string "aleve" (i.e., **partial matching** with the input name).

In [None]:
cids2 = requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/aleve/cids/TXT?name_type=word").text.strip().split()
print(cids2)
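
The only difference between the exact and partial searches is the `name_type` query parameter. As far as the PUG-REST documentation describes it, the accepted values are `complete` (the default) and `word`. The sketch below builds either URL without sending it; the function name `build_name_search_url` is hypothetical.

```python
from urllib.parse import quote, urlencode


def build_name_search_url(name, name_type="complete"):
    """Return a PUG-REST URL that looks up CIDs by name, using exact or partial matching."""
    if name_type not in ("complete", "word"):
        raise ValueError("name_type must be 'complete' or 'word'")
    encoded_name = quote(name, safe="")  # percent-encode spaces and reserved characters
    base = f"https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/{encoded_name}/cids/TXT"
    return base + "?" + urlencode({"name_type": name_type})


print(build_name_search_url("aleve", name_type="word"))
```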

In [None]:
cids3 = [ x for x in cids2 if x not in cids1]   # remove the hits from the exact match.

for mycid in cids3:
    display(Image(requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/" + mycid + "/record/PNG?image_size=200x200").content))
    print("CID " + mycid, ":", requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/" + mycid + "/property/MolecularFormula/TXT").text)
    time.sleep(0.2)
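
The list comprehension that builds `cids3` scans `cids1` once for every element of `cids2`. That is fine for a handful of CIDs, but for long hit lists a set lookup is faster. A minimal sketch follows, with the hypothetical helper name `new_hits`. The sample lists reuse the CIDs named in this recipe, and their order is assumed for illustration.

```python
def new_hits(partial_hits, exact_hits):
    """Return the CIDs found only by partial matching, keeping their original order."""
    exact = set(exact_hits)  # set membership tests take constant time on average
    seen = set()
    result = []
    for cid in partial_hits:
        if cid not in exact and cid not in seen:
            seen.add(cid)      # also drop repeated CIDs
            result.append(cid)
    return result


exact_example = ["156391", "23681059", "6925666"]
partial_example = exact_example + ["56841568", "77098502"]
print(new_hits(partial_example, exact_example))
```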

## 2. Getting chemical names for a given chemical structure

In the example above, the two CIDs found only by partial matching (CIDs 56841568 and 77098502) are mixtures of naproxen with other compounds.  Because they came from partial matching with the name "aleve", we know that their names contain "aleve", but what exactly are they called?  We can find out by retrieving all synonyms for these compounds through PUG-REST.

In [None]:
for mycid in cids3:
    print("#-- CID", mycid)
    print(requests.get("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/" + mycid + "/synonyms/TXT").text)
    time.sleep(0.2)

CID 56841568 is "Aleve-D Sinus & Cold" and CID 77098502 is "Aleve PM".
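
A synonym list can be long, so it can help to print only the entries that contain the search term. The sketch below filters a synonyms response, assuming one synonym per line as in the TXT output above. The function name `matching_synonyms` is hypothetical, and the second line of the sample text is a made-up placeholder, not a real PubChem synonym.

```python
def matching_synonyms(synonyms_text, fragment):
    """Return the synonyms (one per line) that contain `fragment`, ignoring case."""
    needle = fragment.casefold()
    return [
        line.strip()
        for line in synonyms_text.splitlines()
        if needle in line.casefold()
    ]


# First line is a synonym named in this recipe; the second is a placeholder.
sample_text = "Aleve-D Sinus & Cold\nplaceholder-synonym-without-the-term\n"
print(matching_synonyms(sample_text, "aleve"))
```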