Use user-defined annotations and trajectory trees.

CAPITAL can use annotations and trajectory trees that were computed by other methods.

[1]:
import capital as cp
import scanpy as sc
import networkx as nx
import pandas as pd
import numpy as np

In this tutorial, we will use the same datasets as in the previous tutorial.

[2]:
adata1 = cp.dataset.setty19("../data/capital_dataset/setty19_capital.h5ad")
Downloading the dataset.
Download completed. The dataset is saved in ../data/capital_dataset/setty19_capital.h5ad
[3]:
sc.pl.umap(adata1, color="leiden")
../_images/tutorials_capital_tutorial_adding_annotations_trajectory_tree_5_0.png

Annotate the clusters.

Define a cluster annotation based on gene expression: a dictionary that maps each Leiden cluster label to a cell type.

[4]:
annotation1 = {'4': 'HSC', '7': 'HSC',
        '6': 'MPP',
        '13': 'MEP', '1': 'MEP', '5': 'MEP',
        '22': 'Mega',
        '11': 'Ery',
        '3': 'GMP1','10': 'GMP1', '12': 'GMP1', '21': 'GMP1',
        '9': 'GMP2', '15': 'GMP2',
        '14': 'CLP', '19': 'CLP', '20': 'CLP', '2': 'CLP',
        '18': 'DC', '17': 'DC', '16': 'DC',
        '0': 'Mono', '8': 'Mono'
    }

Add the new annotations to adata1.obs as “new_cluster”.

[5]:
adata1.obs["new_cluster"] = adata1.obs["leiden"]
adata1.obs["new_cluster"] = adata1.obs["new_cluster"].astype(object).replace(annotation1).astype('category')
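One thing to watch with this mapping: replace() leaves any cluster label that is missing from the dictionary unchanged, so a forgotten cluster would silently keep its numeric name. The following is a minimal sketch of a pre-check. It uses a toy pandas Series as a stand-in for adata1.obs["leiden"]; the toy labels and the shortened dictionary are assumptions made for illustration and are not part of the tutorial data.

```python
import pandas as pd

# Toy stand-in for adata1.obs["leiden"] (hypothetical values).
leiden = pd.Series(["0", "1", "2", "1", "3"], dtype="category")

# Shortened annotation dictionary; cluster "3" is deliberately left out.
annotation = {"0": "Mono", "1": "MEP", "2": "CLP"}

# Cluster labels that have no entry in the dictionary.
missing = sorted(set(leiden.cat.categories) - set(annotation))
print("Clusters without an annotation:", missing)

# Series.map turns unmapped labels into NaN, which makes gaps easy to spot.
mapped = leiden.astype(object).map(annotation)
print("Cells left unannotated:", int(mapped.isna().sum()))
```

If the list of missing clusters is empty, the replace() call above covers every cell.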
[6]:
adata1.obs["new_cluster"]
[6]:
index
Run4_120703408880541     MEP
Run4_120703409056541     HSC
Run4_120703409580963    Mono
Run4_120703423990708    Mono
Run4_120703424252854     CLP
                        ...
Run5_241114589051630     Ery
Run5_241114589051819    GMP1
Run5_241114589128940     MEP
Run5_241114589357942     Ery
Run5_241114589841822     CLP
Name: new_cluster, Length: 5780, dtype: category
Categories (10, object): ['CLP', 'DC', 'Ery', 'GMP1', ..., 'MEP', 'MPP', 'Mega', 'Mono']
[7]:
adata1.obs["new_cluster"].cat.categories
[7]:
Index(['CLP', 'DC', 'Ery', 'GMP1', 'GMP2', 'HSC', 'MEP', 'MPP', 'Mega',
       'Mono'],
      dtype='object')
[8]:
sc.pl.umap(adata1, color="new_cluster")
../_images/tutorials_capital_tutorial_adding_annotations_trajectory_tree_12_0.png

Importing annotations from a NumPy array

If you have the annotations as a NumPy array with one entry per cell, in the same order as the cells in the AnnData object, you can simply add that array to anndata.obs.

[9]:
# Create a random annotation for demonstration; the numbers have no meaning.
ndarray_demo_annotation = np.random.randint(low=0, high=10, size=adata1.obs.shape[0])
ndarray_demo_annotation
[9]:
array([5, 3, 8, ..., 6, 5, 7])

Add the array to anndata.obs.

[10]:
adata1.obs["demo_annotation"] = ndarray_demo_annotation
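Two details are worth checking when importing an array this way: the array must have exactly one entry per cell, and integer labels are stored as numbers, which scanpy's plotting functions treat as continuous values rather than as groups. The sketch below shows one way to handle both, using a toy DataFrame as a stand-in for adata1.obs. The cell names and labels are hypothetical, and whether CAPITAL itself requires a categorical column for groupby is an assumption here, not something this tutorial states.

```python
import numpy as np
import pandas as pd

# Toy stand-in for adata1.obs: one row per cell (hypothetical cell names).
obs = pd.DataFrame(index=[f"cell_{i}" for i in range(6)])

# Hypothetical integer labels, one per cell, in the same order as obs.index.
labels = np.array([0, 2, 1, 1, 0, 2])

# Guard against an array that does not line up with the cells.
if labels.shape[0] != obs.shape[0]:
    raise ValueError("annotation length does not match the number of cells")

# Store the labels as categorical strings rather than integers.
obs["demo_annotation"] = pd.Categorical(labels.astype(str))
print(obs["demo_annotation"].cat.categories.tolist())
```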

Compute a trajectory tree

If you have annotations of the cells but not a trajectory tree,
run cp.tl.trajectory_tree() as below and skip the next section, which covers supplying your own tree.
Pass the name of the annotation column in anndata.obs to the argument “groupby”.
[11]:
cp.tl.trajectory_tree(adata1, root_node="HSC", groupby="new_cluster")
[12]:
cp.pl.trajectory_tree(adata1)
../_images/tutorials_capital_tutorial_adding_annotations_trajectory_tree_19_0.png

Draw the trajectory tree from your analysis or other methods.

CAPITAL accepts the trajectory tree as a directed graph (DiGraph) from NetworkX.
The tree can be written as below or converted from another data format, such as an adjacency matrix.
Please read NetworkX’s directed graph page for more information.
The names of the nodes in the tree must match the annotations that you defined.
[13]:
tree1 = nx.DiGraph()
tree1.add_edges_from(
    [("HSC", "MPP"), ("MPP", "MEP"),("MEP","Mega"),("MEP","Ery"),
     ("MPP","CLP"),("MPP","GMP1"),("GMP1","GMP2"),("GMP2","DC"),("GMP2","Mono")]
)
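As an illustration of the adjacency-matrix route mentioned above, the following sketch builds a small directed graph from a pandas DataFrame with NetworkX's from_pandas_adjacency. The four-node matrix is a made-up subset of the tree, chosen only to keep the example short; a nonzero entry in row u and column v is read as an edge from u to v.

```python
import networkx as nx
import pandas as pd

# Hypothetical four-node subset of the tree, written as an adjacency matrix.
nodes = ["HSC", "MPP", "MEP", "CLP"]
adjacency = pd.DataFrame(0, index=nodes, columns=nodes)
adjacency.loc["HSC", "MPP"] = 1  # row = parent, column = child
adjacency.loc["MPP", "MEP"] = 1
adjacency.loc["MPP", "CLP"] = 1

# create_using=nx.DiGraph keeps the edge direction (row -> column).
tree_from_matrix = nx.from_pandas_adjacency(adjacency, create_using=nx.DiGraph)
print(sorted(tree_from_matrix.edges()))
```

The row and column labels become the node names, so they should already be the annotation names used in anndata.obs.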

Pass the AnnData object, the name of the root node, the annotation column and the directed graph defined above.

[14]:
cp.tl.trajectory_tree(adata1, root_node="HSC", groupby="new_cluster", tree=tree1)
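Because the node names must match the annotations, it can help to sanity-check a hand-written tree before a call like the one above. The sketch below is one possible check and is not part of CAPITAL: it compares the node names with the cluster names and uses NetworkX's is_arborescence to confirm that the graph is a directed tree. The category list is copied from the output shown earlier in this tutorial, and which checks CAPITAL performs internally is not assumed here.

```python
import networkx as nx

# Cluster names copied from the adata1.obs["new_cluster"] categories shown earlier.
categories = ["CLP", "DC", "Ery", "GMP1", "GMP2", "HSC", "MEP", "MPP", "Mega", "Mono"]

# Same edges as tree1 in the tutorial.
tree = nx.DiGraph()
tree.add_edges_from(
    [("HSC", "MPP"), ("MPP", "MEP"), ("MEP", "Mega"), ("MEP", "Ery"),
     ("MPP", "CLP"), ("MPP", "GMP1"), ("GMP1", "GMP2"), ("GMP2", "DC"), ("GMP2", "Mono")]
)
root = "HSC"

# Node names that are not cluster names, and clusters that the tree omits.
unknown_nodes = sorted(set(tree.nodes) - set(categories))
unused_clusters = sorted(set(categories) - set(tree.nodes))

# A rooted directed tree: every node except the root has exactly one parent.
is_rooted_tree = nx.is_arborescence(tree) and tree.in_degree(root) == 0

print("Unknown nodes:", unknown_nodes)
print("Clusters missing from the tree:", unused_clusters)
print("Rooted tree:", is_rooted_tree)
```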

Draw the trajectory tree you defined.

[15]:
cp.pl.trajectory_tree(adata1)
../_images/tutorials_capital_tutorial_adding_annotations_trajectory_tree_25_0.png

Apply the same process to one or more datasets that you would like to align.

[16]:
adata2 = cp.dataset.velten17("../data/capital_dataset/velten17_capital.h5ad")
Downloading the dataset.
Download completed. The dataset is saved in ../data/capital_dataset/velten17_capital.h5ad
[17]:
annotation2 = {
    '0': 'HSC', '1': 'HSC',
    '17': 'MPP', '14': 'MPP',
    '2': 'Pre-B', '20': 'Pre-B', '15': 'Pre-B',
    '12': 'MEP', '19': 'MEP', '18': 'MEP',
    '6': 'Ery', '9': 'Ery',
    '13': 'Mega',
    '8': 'GMP', '5': 'GMP', '10': 'GMP', '21': 'GMP',
    '16': 'Neutro', '3': 'Neutro', '7': 'Neutro',
    '11': 'Mono/DC',
    '4': 'Eo/Baso/Mast'
}
[18]:
adata2.obs["new_cluster"] = adata2.obs["leiden"]
adata2.obs["new_cluster"] = adata2.obs["new_cluster"].astype(object).replace(annotation2).astype('category')
[19]:
tree2 = nx.DiGraph()
tree2.add_edges_from(
    [("HSC", "MPP"), ("MPP", "MEP"),("MEP","Mega"),("MEP","Ery"),
     ("MPP","Pre-B"),("MPP","GMP"),("GMP","Mono/DC"),("GMP","Neutro"),("GMP","Eo/Baso/Mast")]
)
[20]:
cp.tl.trajectory_tree(adata2, root_node="HSC", groupby="new_cluster", tree=tree2)
[21]:
cp.pl.trajectory_tree(adata2)
../_images/tutorials_capital_tutorial_adding_annotations_trajectory_tree_32_0.png

Aligning trajectory trees

[22]:
cdata = cp.tl.tree_alignment(adata1, adata2, num_genes1=2000, num_genes2=2000)
Calculating tree alignment
411 genes are used to calculate cost of tree alignment.

Calculation finished.

Draw the tree alignment result.

[23]:
cp.pl.tree_alignment(cdata)
../_images/tutorials_capital_tutorial_adding_annotations_trajectory_tree_36_0.png

Inspect the names of the alignments and their aligned paths with alignmentlist.

[24]:
cdata.alignmentlist
[24]:
[('alignment000',
  ['HSC', 'MPP', 'GMP1', '#'],
  ['HSC', 'MPP', 'GMP', 'Eo/Baso/Mast']),
 ('alignment001',
  ['HSC', 'MPP', 'GMP1', 'GMP2', 'Mono'],
  ['HSC', 'MPP', 'GMP', '#', 'Neutro']),
 ('alignment002',
  ['HSC', 'MPP', 'GMP1', 'GMP2', 'DC'],
  ['HSC', 'MPP', 'GMP', '#', 'Mono/DC']),
 ('alignment003', ['HSC', 'MPP', 'MEP', 'Ery'], ['HSC', 'MPP', 'MEP', 'Ery']),
 ('alignment004',
  ['HSC', 'MPP', 'MEP', 'Mega'],
  ['HSC', 'MPP', 'MEP', 'Mega']),
 ('alignment005', ['HSC', 'MPP', 'CLP'], ['HSC', 'MPP', 'Pre-B'])]
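Each entry in the list is a tuple of an alignment name and two paths of equal length, one per dataset. The sketch below pairs the clusters position by position, using the first two entries copied from the output above. It treats “#” as a gap, meaning a position with no counterpart in the other tree; that reading is an assumption based on the output and is not stated in this tutorial.

```python
# First two entries copied from the cdata.alignmentlist output above.
alignmentlist = [
    ("alignment000",
     ["HSC", "MPP", "GMP1", "#"],
     ["HSC", "MPP", "GMP", "Eo/Baso/Mast"]),
    ("alignment001",
     ["HSC", "MPP", "GMP1", "GMP2", "Mono"],
     ["HSC", "MPP", "GMP", "#", "Neutro"]),
]

GAP = "#"  # assumed to mark a position without a matching cluster

# For each alignment, keep only the positions where both trees have a cluster.
matched = {}
for name, path1, path2 in alignmentlist:
    pairs = list(zip(path1, path2))
    matched[name] = [(a, b) for a, b in pairs if GAP not in (a, b)]
    print(name, matched[name])
```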