PHAROST API Reference

Public API surface of the pharost package. Import as:

import pharost

The package exposes three layers:

  • Top-level (pharost.*): training, inference, model loading.

  • pharost.analysis.*: downstream analyses on predicted drug-response scores.

  • pharost.model.*: lower-level neural network components (for advanced use).


Model Modules

pharost.train

pharost.train(
    p_bulk_gene_exp,
    p_bulk_label,
    p_adata,
    out_dir,
    n_epochs,
    batch_size,
    seed=42,
    device='cuda',
    LR=5e-5,
    GAT_hidden_dim=(512, 64),
    lmmd_weight=0.3,
    coral_weight=0.7,
    spatial_preprocess=False,
    p_cell_info=None,
    p_sc_label=None,
)

Runs end-to-end training: loads the bulk and spatial data, trains the GAT-MLP transfer model with LMMD and CORAL domain-adaptation losses, and writes the trained model and predicted probabilities to out_dir.

Parameters

Parameter           Type             Description
p_bulk_gene_exp     str              Path to the bulk RNA-seq expression CSV (samples × genes).
p_bulk_label        str              Path to the bulk drug-response label CSV.
p_adata             str              Path to the spatial AnnData file (.h5 or .h5ad).
out_dir             str              Output directory; predictions and the per-run log are written here.
n_epochs            int              Number of training epochs.
batch_size          int              Number of METIS partitions for the spatial graph.
seed                int              Random seed for reproducibility. Default: 42.
device              str              'cuda' or 'cpu'.
LR                  float            AdamW learning rate. Default: 5e-5.
GAT_hidden_dim      tuple[int, int]  GAT hidden dimensions (num_hidden, out_dim). Default: (512, 64).
lmmd_weight         float            LMMD loss weight. Default: 0.3.
coral_weight        float            CORAL loss weight. Default: 0.7.
spatial_preprocess  bool             If True, run scanpy normalize/log/scale on the spatial data.
p_cell_info         str, optional    Path to a CSV with x_centroid/y_centroid columns; used if adata.obsm['spatial'] is missing.
p_sc_label          str, optional    Path to scRNA-seq labels (passed to the inner trainer).

Returns: trained TransferNN model.

Outputs: writes the following to out_dir:

  • pharost_weights_adapted.pth — pickled trained model.

  • predicted_probabilities.csv — spatial-cell predicted probabilities.

  • predicted_probabilities_bulk.csv — bulk predicted probabilities.
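A minimal sketch of a training call, assuming the signature documented above. Every file path is a hypothetical placeholder, and the n_epochs and batch_size values are illustrative rather than recommended settings. The run_training helper is not part of pharost; it only defers the import so the snippet can be read without the package installed.

```python
# Keyword arguments for pharost.train. All paths are hypothetical placeholders.
train_kwargs = dict(
    p_bulk_gene_exp="data/bulk_expression.csv",
    p_bulk_label="data/bulk_labels.csv",
    p_adata="data/spatial.h5ad",
    out_dir="BC_result/LAPATINIB",
    n_epochs=50,        # illustrative value, not a recommendation
    batch_size=8,       # number of METIS partitions of the spatial graph
    seed=42,
    device="cpu",       # switch to "cuda" when a GPU is available
    lmmd_weight=0.3,
    coral_weight=0.7,
)


def run_training(kwargs):
    """Hypothetical helper: call pharost.train with the given arguments."""
    import pharost  # imported lazily so the sketch parses without the package
    return pharost.train(**kwargs)


# model = run_training(train_kwargs)  # uncomment once the paths point at real files
```

After a successful run, out_dir should contain the three files listed above.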


pharost.load_model

pharost.load_model(path, device='cuda')

Load a saved PHAROST model from disk.

Parameters

Parameter           Type             Description
path                str              Path to a saved .pth file produced by pharost.train().
device              str              Device to map the model onto.

Returns: TransferNN model.


pharost.predict

pharost.predict(
    model,
    adata,
    batch_size,
    device='cuda',
    spatial_preprocess=False,
    spa_identity=False,
)

Run inference on spatial cells using a trained model.

Parameters

Parameter           Type             Description
model               TransferNN       Model from pharost.train() or pharost.load_model().
adata               AnnData          Spatial transcriptomics object with adata.obsm['spatial'].
batch_size          int              Number of METIS partitions; use the same value as in training.
device              str              'cuda' or 'cpu'.
spatial_preprocess  bool             If True, apply scanpy preprocessing before inference.
spa_identity        bool             If True, use an identity matrix for the spatial graph.

Returns: list[np.ndarray] — per-cell predicted probability vectors, ordered to match adata.obs.
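Because the predictions are ordered to match adata.obs, they can be paired with cell IDs by position. The sketch below assumes that ordering. The helpers predictions_to_rows and run_inference are hypothetical and not part of pharost, and the two-element probability vectors in the stand-in data are only an assumption about the output shape.

```python
def predictions_to_rows(obs_names, probs):
    """Pair each cell ID with its predicted probability vector, by position."""
    if len(obs_names) != len(probs):
        raise ValueError("expected exactly one prediction per cell")
    return [[name, *map(float, vec)] for name, vec in zip(obs_names, probs)]


def run_inference(model_path, adata_path, batch_size, device="cpu"):
    """Hypothetical helper: load a saved model and predict on a spatial dataset."""
    import anndata as ad   # lazy imports keep the sketch readable
    import pharost         # without either package installed

    adata = ad.read_h5ad(adata_path)
    model = pharost.load_model(model_path, device=device)
    probs = pharost.predict(model, adata, batch_size=batch_size, device=device)
    return predictions_to_rows(list(adata.obs_names), probs)


# Stand-in data in place of a real pharost.predict() result.
rows = predictions_to_rows(["cell_1", "cell_2"], [[0.2, 0.8], [0.9, 0.1]])
```

Each row starts with the cell ID followed by that cell's probabilities, so the rows can be written out with the csv module or wrapped in a DataFrame.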


Analysis Modules (pharost.analysis)

Downstream analyses. All functions assume adata.obs[drug] already contains predicted probabilities (use load_response_prediction to populate them).

pharost.analysis.load_response_prediction

pharost.analysis.load_response_prediction(
    adata,
    drugs,
    path_template,
    add_label=False,
    label_threshold=0.5,
)

Load per-drug prediction CSVs into adata.obs.

Parameters

Parameter           Type             Description
adata               AnnData | str    AnnData object, or path to an .h5ad file.
drugs               list[str]        Drug names to load.
path_template       str | callable   Per-drug path: format string with a {drug} placeholder, or callable(drug) -> str.
add_label           bool             If True, also write adata.obs[f"{drug}_label"] as "Sensitive"/"Resistant".
label_threshold     float            Threshold for the Sensitive/Resistant cutoff.

Returns: the AnnData object, modified in place and also returned.

Example

adata = pharost.analysis.load_response_prediction(
    adata,
    drugs=['LAPATINIB', 'AFATINIB'],
    path_template=lambda d: f'BC_result/{d}/predicted_probabilities.csv',
)
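The two accepted forms of path_template, and the labelling controlled by add_label and label_threshold, can be sketched in plain Python. Both helpers are illustrative stand-ins, not pharost internals. The labelling rule (probability strictly above the threshold means "Sensitive") is an assumption inferred from the plotting functions in this module; the reference does not state how ties at the threshold are handled.

```python
def resolve_path(path_template, drug):
    """Return the CSV path for one drug from a format string or a callable."""
    if callable(path_template):
        return path_template(drug)
    return path_template.format(drug=drug)


def response_label(prob, label_threshold=0.5):
    """Assumed rule: a probability above the threshold is labelled 'Sensitive'."""
    return "Sensitive" if prob > label_threshold else "Resistant"


# The format-string and callable forms resolve to the same path.
from_string = resolve_path("BC_result/{drug}/predicted_probabilities.csv", "LAPATINIB")
from_callable = resolve_path(
    lambda d: f"BC_result/{d}/predicted_probabilities.csv", "LAPATINIB"
)

labels = [response_label(p) for p in (0.91, 0.50, 0.12)]
```

The callable form is useful when the directory layout cannot be expressed with a single {drug} placeholder.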

pharost.analysis.plot_response_celltype_prop

pharost.analysis.plot_response_celltype_prop(
    adata,
    target_drugs,
    cell_type_col,
    save=False,
    file_format='pdf',
    sample_id=None,
    save_dir='Drug_Celltype',
    palette='tab10',
)

Bar plot of the proportion of “sensitive” cells (probability > 0.5) in each cell type, faceted by drug. Cell types with fewer than 5 cells are dropped.
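The quantity shown in the bar plot can be sketched without any plotting code. The function below is an illustrative reimplementation of the description above for a single drug, not the package's own code; its name and arguments are hypothetical.

```python
from collections import defaultdict


def sensitive_proportions(cell_types, probs, threshold=0.5, min_cells=5):
    """Fraction of cells with probability > threshold, per cell type.

    Cell types with fewer than min_cells cells are dropped.
    """
    counts = defaultdict(lambda: [0, 0])  # cell type -> [n_sensitive, n_total]
    for ctype, p in zip(cell_types, probs):
        counts[ctype][0] += int(p > threshold)
        counts[ctype][1] += 1
    return {c: s / n for c, (s, n) in counts.items() if n >= min_cells}


cell_types = ["Tumor"] * 6 + ["T cell"] * 5 + ["B cell"] * 2
probs = [
    0.9, 0.8, 0.7, 0.6, 0.4, 0.2,  # Tumor: 4 of 6 cells above 0.5
    0.1, 0.2, 0.3, 0.6, 0.4,       # T cell: 1 of 5 cells above 0.5
    0.9, 0.9,                      # B cell: only 2 cells, so it is dropped
]
props = sensitive_proportions(cell_types, probs)
```

Here props holds one proportion each for "Tumor" and "T cell"; "B cell" is excluded by the minimum-cell filter.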


pharost.analysis.drug_gene_correlation

pharost.analysis.drug_gene_correlation(
    adata,
    target_drugs,
    n_top_genes=15,
    cmap=None,
    plot=True,
    annot=False,
    save=False,
    save_dir='Gene_Drug_Corr',
    file_format='png',
    verbose=False,
)

Spearman correlation between gene expression and drug response across all cells. Selects the top n_top_genes genes for each drug and plots a clustered heatmap of the union of those gene sets.

Returns: long-form pd.DataFrame with columns Gene, Drug, Correlation.
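The computation can be sketched in pure Python on toy data. This is an illustration of the technique, not the package's implementation. Two details are assumptions, since the reference does not specify them: genes are ranked by absolute correlation, and the long-form records are restricted to the union of top genes. All function names here are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined for constant input
    return cov / (sx * sy)


def drug_gene_table(expression, responses, n_top_genes=15):
    """expression: {gene: per-cell values}; responses: {drug: per-cell probabilities}."""
    corr = {
        drug: {gene: spearman(vals, resp) for gene, vals in expression.items()}
        for drug, resp in responses.items()
    }
    union = set()
    for by_gene in corr.values():
        # Assumption: "top" means largest absolute correlation.
        ranked = sorted(by_gene, key=lambda g: abs(by_gene[g]), reverse=True)
        union.update(ranked[:n_top_genes])
    genes = sorted(union)
    records = [
        {"Gene": g, "Drug": d, "Correlation": corr[d][g]} for d in corr for g in genes
    ]
    return records, genes


expression = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [5, 4, 3, 2, 1],
    "GENE_C": [2, 1, 4, 3, 5],
}
responses = {"DRUG_X": [0.1, 0.2, 0.3, 0.4, 0.9]}
records, genes = drug_gene_table(expression, responses, n_top_genes=2)
```

The records list has the same Gene, Drug, Correlation layout as the returned DataFrame and can be passed to pd.DataFrame(records) if pandas is available.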