EXAONE Path 2.0 rev EGFR

Introduction

EXAONE Path 2.0 rev EGFR predicts EGFR mutation status from Whole Slide Images (WSIs) using a four-stage pipeline:

Patchify (§3.1): Segment tissue regions from the WSI and extract patch coordinates.
Patch Feature Extraction (§3.2): Encode each patch into a feature vector using the pretrained patch encoder, producing per-slide HDF5 feature files.
Slide Feature Extraction (§3.3): Aggregate patch-level features into a single slide-level representation using the slide encoder.
EGFR Classification (§3.4): Feed the slide representation into the classifier head to output an EGFR mutation probability score.

Quickstart

Load EXAONE Path 2.0 rev EGFR and extract features.

1. Hardware Requirements

NVIDIA GPU with 12GB+ VRAM
NVIDIA driver version >= 525.60.13 required

Note: The provided environment installs CUDA-enabled PyTorch, so an NVIDIA GPU and a compatible NVIDIA driver are required to run the model.

2. Environment Setup

Install Micromamba first if needed (installation guide), then create and activate the environment:

micromamba create -n exaone-path-egfr python=3.12
micromamba activate exaone-path-egfr
pip install -U "huggingface_hub[cli]"
hf download LGAI-EXAONE/EXAONE-Path-2.0-rev-EGFR --repo-type model --local-dir exaone-path-2.0-rev-egfr
cd exaone-path-2.0-rev-egfr
pip install -r requirements.txt

3. Inference Workflow Overview

Run the following stages in order.

3.1. Patchify

Generate patch coordinates with the Python API patchfy_batch, which performs three steps:

Segment tissue regions.
Extract patch coordinates.
Save an HDF5 file containing coords and contour_index (save_h5=True; required for Steps 3.2-3.4).

from exaonepath.patches import patchfy_batch

data_folder = "path/to/your/wsi_folder"
out_dir = "path/to/output_dir"

results = patchfy_batch(
    data_folder=data_folder,  # scans common WSI formats
    out=f"{out_dir}/patches",
    num_workers=8,  # number of CPU cores to use
    csv_path=f"{out_dir}/patches/logs/patchfy_batch_status.csv",
    patch_size=256,
    step_size=256,
    patch_level=0,
    save_h5=True,
    save_mask=True,
)

Outputs

Patch coordinate HDF5: <out_dir>/patches/patches/<slide_id>.h5
Status CSV: <out_dir>/patches/logs/patchfy_batch_status.csv
(Optional) masks: <out_dir>/patches/masks/<slide_id>.jpg
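
Because Steps 3.2-3.4 depend on the coordinate HDF5 files, it can help to confirm they exist before moving on. The sketch below is an illustration only: missing_coordinate_files is a hypothetical helper, not part of exaonepath, and it relies solely on the output layout listed above.

```python
from pathlib import Path


def missing_coordinate_files(out_dir: str, slide_ids: list[str]) -> list[str]:
    """Return the slide IDs that have no <out_dir>/patches/patches/<slide_id>.h5 file."""
    coords_dir = Path(out_dir) / "patches" / "patches"
    return [sid for sid in slide_ids if not (coords_dir / f"{sid}.h5").is_file()]
```

The caller supplies the slide IDs. If they are derived from WSI filenames (for example, the filename without its extension), treat that mapping as an assumption and check it against the status CSV.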

3.2. Patch Feature Extraction

Extract patch features for all successful WSIs from the patchify status CSV:

python -m exaonepath.feature_extraction.extract_batch_patch_feature \
    --status_csv_path "<out_dir>/patches/logs/patchfy_batch_status.csv" \
    --coords_root_dir "<out_dir>/patches/patches" \
    --feature_out_dir "<out_dir>/patch_features" \
    --batch_size_per_gpu 1024 \
    --gpu_ids "" \
    --num_workers_per_gpu -1

Notes

--status_csv_path: patchfy_batch output status CSV.
--coords_root_dir: directory containing coordinate HDF5 files (<slide_id>.h5) from Step 3.1.
--feature_out_dir: output directory root for patch feature HDF5 files and logs.
--batch_size_per_gpu: batch size per GPU. Adjust based on available GPU memory.
--gpu_ids: comma-separated GPU indices to use. Default: "" (all visible GPUs).
--num_workers_per_gpu: DataLoader workers per GPU. Default: -1 (auto).
Output status CSV is written under <out_dir>/patch_features/logs/.
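
When the stage is launched from a Python script rather than a shell, the same command can be assembled as an argument list. This is a minimal sketch using only the flags documented above; patch_feature_command is a hypothetical helper, and the default values simply mirror the example command.

```python
import sys


def patch_feature_command(
    out_dir: str,
    batch_size_per_gpu: int = 1024,
    gpu_ids: str = "",
    num_workers_per_gpu: int = -1,
) -> list[str]:
    """Build the argv list for the Step 3.2 patch feature extraction command."""
    return [
        sys.executable,  # run the module with the current interpreter
        "-m",
        "exaonepath.feature_extraction.extract_batch_patch_feature",
        "--status_csv_path", f"{out_dir}/patches/logs/patchfy_batch_status.csv",
        "--coords_root_dir", f"{out_dir}/patches/patches",
        "--feature_out_dir", f"{out_dir}/patch_features",
        "--batch_size_per_gpu", str(batch_size_per_gpu),
        "--gpu_ids", gpu_ids,  # "" means all visible GPUs
        "--num_workers_per_gpu", str(num_workers_per_gpu),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True), which raises an error if the stage exits with a non-zero status. Lowering batch_size_per_gpu is the first thing to try if the GPU runs out of memory.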

3.3. Slide Feature Extraction

The patch features, patch coordinates (coords), and contour_index produced by Steps 3.1 and 3.2 must be available before running this stage.

Batch extraction (recommended)

python -m exaonepath.feature_extraction.extract_batch_slide_feature \
    --status_csv_path "<out_dir>/patch_features/logs/extract_patch_feature_batch_status.csv" \
    --patch_feature_root_dir "<out_dir>/patch_features" \
    --slide_feature_out_dir "<out_dir>/slide_features" \
    --batch_size_per_gpu 32 \
    --gpu_ids "" \
    --num_workers_per_gpu -1

Notes

--status_csv_path: patch-feature extraction status CSV (Step 3.2 output).
--patch_feature_root_dir: directory containing patch feature files from Step 3.2.
--slide_feature_out_dir: output directory root for slide feature files.
--batch_size_per_gpu: batch size per GPU. Adjust based on available GPU memory.
--gpu_ids: comma-separated GPU indices to use. Default: "" (all visible GPUs).
--num_workers_per_gpu: DataLoader workers per GPU. Default: -1 (auto).
Output status CSV is always written to <out_dir>/slide_features/logs/extract_slide_feature_batch_status.csv.

3.4. EGFR Classification

Batch classification (recommended)

python -m exaonepath.feature_extraction.classify_batch_egfr \
    --slide_feature_status_csv_path "<out_dir>/slide_features/logs/extract_slide_feature_batch_status.csv" \
    --prediction_csv_path "<out_dir>/slide_features/logs/egfr_classification_results.csv" \
    --heatmap_out_dir "<out_dir>/heatmaps" \
    --gpu_ids ""

Batch classification with optional viewer artifacts

python -m exaonepath.feature_extraction.classify_batch_egfr \
    --slide_feature_status_csv_path "<out_dir>/slide_features/logs/extract_slide_feature_batch_status.csv" \
    --prediction_csv_path "<out_dir>/slide_features/logs/egfr_classification_results.csv" \
    --heatmap_out_dir "<out_dir>/heatmaps" \
    --viewer_out_dir "<out_dir>/viewer" \
    --gpu_ids ""

Notes

--slide_feature_status_csv_path: slide-feature extraction status CSV (Step 3.3 output), expected at <out_dir>/slide_features/logs/extract_slide_feature_batch_status.csv.
--prediction_csv_path: output CSV path for EGFR prediction results.
--gpu_ids: comma-separated GPU indices to use. Default: "" (all visible GPUs).
Prediction CSV columns: wsi_name, score.
score is the class-1 probability (probs[1]), i.e., the predicted probability of an EGFR mutation.
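
The prediction CSV can be post-processed with the standard library. The sketch below assumes only the two documented columns, wsi_name and score. The helper names and the 0.5 cutoff are hypothetical; the source does not specify a decision threshold, so any cutoff should be chosen and validated separately.

```python
import csv


def load_scores(prediction_csv_path: str) -> list[tuple[str, float]]:
    """Read (wsi_name, score) pairs from the prediction CSV, highest score first."""
    with open(prediction_csv_path, newline="") as handle:
        rows = [(row["wsi_name"], float(row["score"])) for row in csv.DictReader(handle)]
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


def flag_slides(scores: list[tuple[str, float]], threshold: float = 0.5) -> list[str]:
    """Names of slides whose score is at or above the threshold.

    The default of 0.5 is an illustrative placeholder, not a validated cutoff.
    """
    return [name for name, score in scores if score >= threshold]
```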

Heatmap options

--heatmap_out_dir: output directory for slide-level attention heatmaps. If omitted, defaults to a sibling heatmaps/ directory next to the slide-feature root.
--heatmap_max_side: maximum width/height in pixels for the saved heatmap image (default: 2048).
--heatmap_alpha: blend weight of the heatmap overlay over the slide thumbnail, 0=transparent, 1=fully opaque (default: 0.4).
--heatmap_blur: apply Gaussian blur to smooth patch boundaries; applied twice, once on the score map and once on the final RGB image (default: True).
--heatmap_blur_strength: blur strength multiplier relative to the auto-derived sigma. 1.0=default strength, <1.0=weaker, >1.0=stronger (default: 1.0).
--heatmap_cmap: matplotlib colormap name for the attention heatmap, e.g. coolwarm, jet, viridis (default: coolwarm).
--top_patches_n: also save a grid of the highest-attention patches alongside the heatmap (0 to disable).
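
To build intuition for --heatmap_alpha, the sketch below shows standard linear alpha blending for a single RGB pixel. It assumes the overlay uses this common formula; the repository's actual blending code is not shown in this document, and blend_pixel is a hypothetical helper.

```python
def blend_pixel(
    thumbnail_rgb: tuple[int, int, int],
    heatmap_rgb: tuple[int, int, int],
    alpha: float = 0.4,
) -> tuple[int, ...]:
    """Linearly blend one RGB pixel.

    alpha=0 returns the thumbnail colour unchanged; alpha=1 returns the heatmap colour.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return tuple(
        round((1.0 - alpha) * t + alpha * h)
        for t, h in zip(thumbnail_rgb, heatmap_rgb)
    )
```

Under this assumption, a higher alpha makes high-attention regions stand out more but hides more of the underlying tissue, which is why a moderate default such as 0.4 keeps the slide morphology visible.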

Viewer artifact options

--viewer_out_dir: required to generate the viewer artifacts used by the zoomable EGFR viewer.

3.5. EGFR Viewer

Before running the viewer, generate viewer artifacts in Step 3.4 by setting --viewer_out_dir "<out_dir>/viewer".

Run the local viewer

python -m exaonepath.viewer.serve \
    --viewer_manifest_csv_path "<out_dir>/viewer/logs/egfr_viewer_manifest.csv" \
    --host "127.0.0.1" \
    --port 8000

Then open http://127.0.0.1:8000.

License

The model is licensed under the EXAONEPath AI Model License Agreement 1.0 - NC.

Contact

LG AI Research Technical Support: contact_us1@lgresearch.ai

Paper

Enhancing Whole Slide Pathology Foundation Models through Stain Normalization (arXiv:2408.00380, published Aug 1, 2024)