Keelhaul — Cactus INT8 Adapter Weights

Five fine-tuned multimodal LoRA adapters for on-device scam detection on Gemma 4 E2B, packaged for the Cactus on-device inference runtime.

This is the model-asset companion to the Keelhaul Android app, a submission to the Gemma 4 Good Hackathon (Safety & Trust + Unsloth + LiteRT + Cactus tracks, deadline 2026-05-18).

Code: https://github.com/BrianIsaac/keelhaul

What is shipped

Tarball	Modality	Role	Size (INT8)
`keelhaul_text_triage_v2.tar.gz`	text	Single-message scam / benign classification with phase + technique tagging	~4.7 GB compressed (5.6 GB extracted)
`keelhaul_text_reasoner_v4.tar.gz`	text	Trajectory-window reasoning: tier (safe / watching / concerned / alert) + phase + technique stack + grounded `evidence: [{turn, role}]` + 2-4-sentence reasoning	~4.7 GB compressed
`keelhaul_vision_v2.tar.gz`	vision	Scam-imagery classification: label + `intent_class` (e.g. `fake_document`, `stolen_persona`, `crypto_scam_visual`) + synthesis cues + region hint	~4.7 GB compressed
`keelhaul_audio_v2.tar.gz`	audio	Voice-scam classification: label + `intent_class` (e.g. `authority_impersonation`, `account_takeover_demand`) + vocal cues + weakest segment	~4.7 GB compressed
`keelhaul_video_v6.tar.gz`	video	Deepfake / synthetic-video detection on 4-keyframe + face-crop input; feeds into a runtime video composer alongside vision + audio	~4.7 GB compressed

Each tarball expands to a Cactus model directory: 1959 .weights files + config.txt + tokenizer.json + tokenizer_config.json.
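A quick sanity check of an extracted directory could look like the sketch below. The file names and the 1959 count come from the description above; the helper `check_model_dir` is hypothetical, and it assumes the .weights files sit directly in the directory root rather than in subfolders.

```python
from pathlib import Path

# File names taken from the description above; the helper itself is hypothetical.
REQUIRED_FILES = ("config.txt", "tokenizer.json", "tokenizer_config.json")


def check_model_dir(model_dir, expected_weights=1959):
    """Return a list of problems found in an extracted model directory."""
    root = Path(model_dir)
    problems = []
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing {name}")
    # Assumption: weight files live in the directory root, not in subfolders.
    n_weights = sum(1 for p in root.glob("*.weights") if p.is_file())
    if n_weights != expected_weights:
        problems.append(
            f"expected {expected_weights} .weights files, found {n_weights}"
        )
    return problems
```

An empty list means the layout matches; anything else is worth re-downloading or re-extracting before handing the directory to the runtime.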

Quantisation

INT8 via Cactus's CLI converter. Per-group symmetric quantisation, group size 32, FP16 scales. The multimodal projection layers and the synthetic post-projection norm are kept at FP16 by design (1674 FP16 tensors, 278 INT8-quantised tensors per model).
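To make the scheme concrete, here is a minimal pure-Python sketch of per-group symmetric INT8 quantisation with group size 32 and FP16 scales. It illustrates the general technique only and is not the Cactus converter's code; the [-127, 127] code range and the max-abs scale rule are assumptions, since the converter's exact choices are not stated here.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantise_group(values):
    """Quantise one group symmetrically; returns (int8_codes, fp16_scale)."""
    max_abs = max(abs(v) for v in values)
    # Assumption: scale is max-abs / 127, stored in FP16.
    scale = to_fp16(max_abs / 127.0)
    if scale == 0.0:
        scale = 1.0  # all-zero (or vanishingly small) group: codes become 0
    # Clamp because rounding the scale to FP16 can push v / scale past 127.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def quantise(weights, group_size=32):
    """Split a flat weight list into groups and quantise each one."""
    if len(weights) % group_size != 0:
        raise ValueError("weight count must be a multiple of group_size")
    return [
        quantise_group(weights[start:start + group_size])
        for start in range(0, len(weights), group_size)
    ]


def dequantise(groups):
    """Reconstruct approximate float weights from (codes, scale) pairs."""
    return [code * scale for codes, scale in groups for code in codes]
```

Because each group of 32 weights carries its own scale, one outlier only costs precision within its own group instead of across the whole tensor.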

Quantisation quality (averaged across all 5 conversions):

Metric	Value
MSE (mean)	3.04 × 10⁻⁶
SNR (mean)	44.7 dB
Cosine similarity (mean)	0.999936
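For readers who want to reproduce this kind of check on their own tensors, the sketch below computes the three metrics between an original and a dequantised weight vector. The definitions are the common textbook ones (SNR as signal power over error power, in dB); whether the converter's report uses exactly these formulas is an assumption.

```python
import math


def quant_metrics(original, reconstructed):
    """Return (mse, snr_db, cosine_similarity) for two equal-length vectors."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("vectors must be non-empty and of equal length")
    n = len(original)
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / n
    signal_power = sum(o * o for o in original) / n
    # Assumed SNR definition: 10 * log10(signal power / error power).
    snr_db = 10.0 * math.log10(signal_power / mse) if mse > 0 else math.inf
    dot = sum(o * r for o, r in zip(original, reconstructed))
    norm_o = math.sqrt(sum(o * o for o in original))
    norm_r = math.sqrt(sum(r * r for r in reconstructed))
    if norm_o == 0.0 or norm_r == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return mse, snr_db, dot / (norm_o * norm_r)
```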

Why INT8 not INT4: on-device validation revealed the vision_v2 and video_v6 multimodal projection layers degraded reproducibly under INT4 weight quantisation (truncated JSON output, label-vs-intent_class inconsistency on held-out scam imagery). INT8 uses the same SMMLA hardware kernel path on Snapdragon 8 Gen 3 — same decode speed as INT4 — and restores near-lossless accuracy at ~2× the on-disk size. The full migration narrative is in docs/submission/cactus_models_manifest.md in the code repo.

On-device performance (Honor Magic V3, Snapdragon 8 Gen 3)

~70 prefill tokens/sec
~5 decode tokens/sec (multimodal inputs are heavier than pure text)
~2 s TTFT
~3 GB resident per loaded model (28 GB on-disk for all 5; one model mmap'd at a time)
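As a rough way to read these figures, the sketch below estimates end-to-end generation time from the prefill and decode rates above. It is a deliberately simplified model: it assumes both rates are constant and ignores model loading and any image, audio, or video encoding time, so treat the result as an order-of-magnitude guide only. The function name is hypothetical.

```python
# Approximate rates from the list above (Honor Magic V3, Snapdragon 8 Gen 3).
PREFILL_TOKENS_PER_S = 70.0
DECODE_TOKENS_PER_S = 5.0


def estimate_latency_s(prompt_tokens, output_tokens,
                       prefill_rate=PREFILL_TOKENS_PER_S,
                       decode_rate=DECODE_TOKENS_PER_S):
    """Estimate seconds to process a prompt and generate a response."""
    prefill_s = prompt_tokens / prefill_rate   # time to ingest the prompt
    decode_s = output_tokens / decode_rate     # time to emit the output
    return prefill_s + decode_s
```

Under these assumptions, a 140-token prompt with a 60-token structured response comes out at about 14 s (2 s of prefill plus 12 s of decode), which shows that decode speed dominates for anything beyond very short outputs.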

Training

All five adapters were fine-tuned with Unsloth FastModel on Kaggle T4 × 2, starting from unsloth/gemma-4-E2B-it. Held-out evaluation deltas (zero-shot base → fine-tuned, 95 % bootstrap CI):

Modality	n_eval	Label F1 or tier accuracy (zero-shot → fine-tuned)	Δ
text_triage_v2	563	F1 0.520 → 0.757	+0.237 ★
vision_v2	270	F1 0.442 → 0.976	+0.535 ★
audio_v2	140	F1 0.896 → 0.951	+0.055 ★
reasoner_v4	186	tier acc 0.581 → 0.930	+0.349 ★
video_v6	61	F1 0.000 → 0.81	+0.81

All 13 of 13 metrics are statistically significant at the 95 % bootstrap CI. Full scorecards and per-class breakdowns are under training/reports/v2/ in the code repo.
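For context on what a bootstrap significance check involves, here is a minimal sketch of a paired percentile bootstrap for an F1 delta on binary labels. It is an illustration, not the evaluation code from the repo: the paired resampling, the percentile interval, and the function names are assumptions, and the actual scorecards may use a different variant.

```python
import random


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the given positive label (0.0 if undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_delta_ci(y_true, base_pred, ft_pred,
                       n_resamples=2000, alpha=0.05, seed=0):
    """Percentile CI for F1(fine-tuned) - F1(base) over resampled examples."""
    rng = random.Random(seed)
    n = len(y_true)
    deltas = []
    for _ in range(n_resamples):
        # Paired resampling: the same indices are used for both models.
        idx = [rng.randrange(n) for _ in range(n)]
        t = [y_true[i] for i in idx]
        b = [base_pred[i] for i in idx]
        f = [ft_pred[i] for i in idx]
        deltas.append(f1_score(t, f) - f1_score(t, b))
    deltas.sort()
    lo = deltas[int((alpha / 2) * n_resamples)]
    hi = deltas[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lo, hi
```

Under this reading, a delta counts as significant when the interval returned by `bootstrap_delta_ci` excludes zero.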

Usage on-device

The Keelhaul Android app downloads tarballs on first use of each modality and caches them at /data/data/com.keelhaul.detector/files/cactus_models/<id>/. From there cactus_init(dir.path, null, false) loads them into the Cactus runtime; the app's CactusPipeline.completeRaw hot-swaps modalities on demand.
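The "one model loaded at a time" behaviour can be pictured as a single-slot cache. The sketch below is a language-neutral illustration in Python, not the app's `CactusPipeline` code: the class name and the `load` / `release` callables are hypothetical stand-ins for whatever the app uses to initialise and free a Cactus model.

```python
class SingleSlotModelCache:
    """Keeps at most one model loaded; requesting another swaps it in."""

    def __init__(self, load, release):
        self._load = load          # hypothetical: model_id -> handle
        self._release = release    # hypothetical: handle -> None
        self._current_id = None
        self._handle = None

    def get(self, model_id):
        """Return a handle for model_id, releasing any other loaded model."""
        if model_id == self._current_id:
            return self._handle    # already resident, no reload needed
        if self._handle is not None:
            # Free the previous model first so only one is resident at a time.
            self._release(self._handle)
            self._handle = None
            self._current_id = None
        self._handle = self._load(model_id)
        self._current_id = model_id
        return self._handle
```

Releasing before loading keeps peak memory near the footprint of a single model, at the cost of a reload whenever the modality changes.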

Limitations

audio_v2 produced one observed false positive on a held-out benign LibriSpeech clip during on-device validation.
video_v6 exhibited a pair inversion on the specific real-vs-fake samples used during demo capture; the binary deepfake signal is correct in aggregate (held-out F1 0.81) but is the noisiest of the five adapters.

Licence

Apache-2.0. Base model unsloth/gemma-4-E2B-it is subject to Google's Gemma terms — please review before redistribution.

Citation

Keelhaul: on-device multimodal scam detection on Gemma 4 E2B
Brian Isaac, 2026
Gemma 4 Good Hackathon submission
