Getting Started
Installation
pip install pymaftools
Quick Start
Read a MAF file and create an OncoPlot:
from pymaftools import MAF, OncoPlot
# Read MAF file
maf = MAF.read_maf("data/tcga_paad.maf")
# Create mutation table
table = maf.to_pivot_table()
# Plot top 20 mutated genes
(
OncoPlot(table)
.set_config(figsize=(12, 8))
.oncoplot()
.add_barplot()
.add_legend()
)
Subsetting Data
PivotTable.subset() lets you filter by features (rows) and samples (columns),
with metadata automatically kept in sync.
# By feature names
subset = table.subset(features=["TP53", "KRAS", "EGFR"])
# By boolean mask — select samples of a specific subtype
luad = table.subset(samples=table.sample_metadata["subtype"] == "LUAD")
# Combine both — specific genes in specific samples
result = table.subset(
features=table.feature_metadata["freq"] > 0.1,
samples=table.sample_metadata["subtype"] == "LUSC",
)
# Use with add_freq to compute group-wise mutation frequencies
table = table.add_freq(
groups={
"LUAD": table.subset(samples=table.sample_metadata.subtype == "LUAD"),
"LUSC": table.subset(samples=table.sample_metadata.subtype == "LUSC"),
}
)
Multi-omics Integration
from pymaftools import PivotTable, Cohort
# Build a cohort from multiple omics layers
cohort = Cohort()
cohort.add_table("mutation", mutation_table)
cohort.add_table("expression", expression_table)
cohort.add_table("cnv", cnv_table)
# Subset to shared samples
cohort = cohort.subset(samples=shared_samples)
For full API details, see the API Reference.