Human lymph node dataset

We analyzed the human lymph node dataset. Human Lymph Node can be obtained from 10x Genomics Data Repository (https://www.10xgenomics.com/cn/resources/datasets/human-lymph-node-1-standard-1-1-0). And GC can be obtained from GraphST.

1. Import packages

[1]:
from matplotlib import pyplot as plt
from DeepGFT.utils import *
from DeepGFT.genenet import obtain_genenet
from DeepGFT.train import *
import torch
import scanpy as sc
import warnings
from sklearn.metrics import jaccard_score, classification_report

warnings.filterwarnings('ignore')
os.environ['R_HOME'] = '/users/PCON0022/jxliu/anaconda3/envs/DeepGFT/lib/R'
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
seed_all(2023)
/users/PCON0022/jxliu/anaconda3/envs/DeepGFT/lib/python3.8/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

2. Read data

[2]:
# Read data
adata = sc.read_visium('/fs/ess/PAS1475/Jixin/DeepGFT_proj/data/V1_Human_Lymph_Node', count_file='filtered_feature_bc_matrix.h5')
GC = pd.read_csv('/fs/ess/PAS1475/Jixin/DeepGFT_proj/data/V1_Human_Lymph_Node/manual_GC_annot.csv', index_col=0)
adata.obs['cell_type'] = GC['cell_type']

3. Data processing, including filtering genes and identifying svgs

[3]:
# Data preprocessing
adata.var_names_make_unique()
prefilter_genes(adata, min_cells=3)
adata, adata_raw = svg(adata, svg_method='gft_top', n_top=3000, smoothing=False)

4. Construct network and GFT

Construct spot-spot network and gene co-expression network, and the graph Fourier transform was performed respectively. Obtain signal features of spots and genes.

[4]:
# Build spotnet and genenet
obtain_spotnet(adata, knn_method='Radius', rad_cutoff=150, prune=False)
gene_freq_mtx, gene_eigvecs_T, gene_eigvals = f2s_gene(adata, gene_signal=1500, c1=0.1)
obtain_genenet(adata, dataset='pearson', species='human', cut=0.4)
spot_freq_mtx, spot_eigvecs_T, spot_eigvals = f2s_spot(adata, spot_signal=1500, middle=4, c2=0.001)
gene edges 6988 spots 4035

5. Train GAT module

[5]:
# Train GAT
res, lamda, emb_spot, _, attention = train_spot(adata, gene_freq_mtx, gene_eigvecs_T, spot_freq_mtx, spot_eigvecs_T,
                                                alpha=20, device=device, epoch_max=600, cluster=False)
spot*signal train
100%|██████████| 600/600 [00:21<00:00, 27.82it/s]

6. Downstream analysis

Spatial cluster

[6]:
# Cluster and plot
adata.obsm['emb'] = emb_spot
sc.pp.neighbors(adata, use_rep='emb')
sc.tl.leiden(adata, resolution=0.2)
clusters = adata.obs['leiden'].values.unique()
jac_max = 0
for i in clusters:
    jac = jaccard_score(adata.obs['cell_type'], (adata.obs['leiden'] == i).astype(int))
    if jac > jac_max:
        jac_max = jac
        cluster_GC = i
print('cluster:', clusters.shape[0], 'Jaccard:', jac_max)
cluster: 13 Jaccard: 0.7911832946635731
[7]:
index = classification_report(adata.obs['cell_type'].astype(int), (adata.obs['leiden'] == cluster_GC).astype(int))
print(index)
              precision    recall  f1-score   support

           0       0.99      0.99      0.99      3657
           1       0.87      0.90      0.88       378

    accuracy                           0.98      4035
   macro avg       0.93      0.94      0.94      4035
weighted avg       0.98      0.98      0.98      4035

[8]:
adata.uns['leiden_colors'] = ['#1f77b4', '#d62728', '#279e68', '#ff7f0e', '#ffff85', '#8c564b', '#e377c2',
                               '#b5bd61', '#17becf', '#aec7e8', '#ffbb78', '#98df8a', '#ff9896']
sc.pl.spatial(adata, color=["cell_type", "leiden"], title=['GC', 'DeepGFT (Jaccard=%.2f)' % jac_max], colorbar_loc=None)
... storing 'feature_types' as categorical
... storing 'genome' as categorical
../_images/notebook_5_Human_Lymph_Node_17_1.png

Gene denoise

[9]:
res_denoising, att = denoising(res, adata.uns['spotnet_adj'], attention[:, 0])
adata_res = adata.copy()
adata_res.X = res_denoising
[10]:
svg_list = ['CR2', 'SERPINE2', 'RGS13', 'STMN1', 'IGHG2', 'LINC00861', 'CCL20', 'IFI44L']
sc.pl.spatial(adata_res, color=svg_list)
../_images/notebook_5_Human_Lymph_Node_20_0.png
[ ]: