Keelhaul: Cactus INT8 Adapter Weights
Five fine-tuned multimodal LoRA adapters for on-device scam detection on Gemma 4 E2B, packaged for the Cactus on-device inference runtime.
This is the model-asset companion to the Keelhaul Android app, a submission for the Gemma 4 Good Hackathon (Safety & Trust + Unsloth + LiteRT + Cactus tracks, deadline 2026-05-18).
Code: https://github.com/BrianIsaac/keelhaul
What is shipped
| Tarball | Modality | Role | Size (INT8) |
|---|---|---|---|
| keelhaul_text_triage_v2.tar.gz | text | Single-message scam / benign classification with phase + technique tagging | ~4.7 GB compressed (5.6 GB extracted) |
| keelhaul_text_reasoner_v4.tar.gz | text | Trajectory-window reasoning: tier (safe / watching / concerned / alert) + phase + technique stack + grounded evidence: [{turn, role}] + 2-4-sentence reasoning | ~4.7 GB compressed |
| keelhaul_vision_v2.tar.gz | vision | Scam-imagery classification: label + intent_class (e.g. fake_document, stolen_persona, crypto_scam_visual) + synthesis cues + region hint | ~4.7 GB compressed |
| keelhaul_audio_v2.tar.gz | audio | Voice-scam classification: label + intent_class (e.g. authority_impersonation, account_takeover_demand) + vocal cues + weakest segment | ~4.7 GB compressed |
| keelhaul_video_v6.tar.gz | video | Deepfake / synthetic-video detection on 4-keyframe + face-crop input; feeds into a runtime video composer alongside vision + audio | ~4.7 GB compressed |
Each tarball expands to a Cactus model directory: 1959 .weights files plus config.txt, tokenizer.json, and tokenizer_config.json.
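As a quick sanity check after extraction, the layout described above can be verified with a short script. This is a minimal sketch, not part of the Keelhaul tooling: the function name `check_model_dir` is hypothetical, and because the text does not say whether the .weights files sit at the top level or in subdirectories, the sketch searches recursively.

```python
from pathlib import Path

# Metadata files named in the text above.
REQUIRED_FILES = ("config.txt", "tokenizer.json", "tokenizer_config.json")


def check_model_dir(model_dir, expected_weights=1959):
    """Return a list of problems found; an empty list means the layout looks right."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = [
        f"missing {name}" for name in REQUIRED_FILES if not (root / name).is_file()
    ]

    # Assumption: count .weights files anywhere under the directory,
    # since the exact nesting is not documented here.
    n_weights = sum(1 for p in root.rglob("*.weights") if p.is_file())
    if n_weights != expected_weights:
        problems.append(
            f"expected {expected_weights} .weights files, found {n_weights}"
        )
    return problems
```

Calling `check_model_dir("path/to/extracted/model")` on an intact extraction should return an empty list; any returned strings describe what is missing.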
Quantisation
INT8 via Cactus's CLI converter. Per-group symmetric quantisation, group size 32, FP16 scales. The multimodal projection layers and the synthetic post-projection norm are kept at FP16 by design (1674 FP16 tensors, 278 INT8-quantised tensors per model).
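To make the scheme concrete, here is a minimal pure-Python sketch of per-group symmetric INT8 quantisation with group size 32 and FP16 scales. It illustrates the general technique only; the Cactus converter's actual implementation (rounding mode, clamping range, storage layout) is not described here, so the choices below, such as the [-127, 127] code range and the function names, are assumptions.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantise_group(values):
    """Symmetric quantisation of one group: returns (int8 codes, FP16 scale)."""
    max_abs = max(abs(v) for v in values)
    # One scale per group, stored at FP16 precision (zero point is always 0).
    scale = to_fp16(max_abs / 127.0)
    if scale == 0.0:
        # All-zero group, or a scale too small to represent in FP16.
        return [0] * len(values), 0.0
    # Clamp because rounding the scale to FP16 can push v / scale just past 127.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def quantise(weights, group_size=32):
    """Quantise a flat list of floats group by group."""
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group_codes, scale = quantise_group(weights[start:start + group_size])
        codes.extend(group_codes)
        scales.append(scale)
    return codes, scales


def dequantise(codes, scales, group_size=32):
    """Reconstruct approximate floats: each code times its group's scale."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]
```

With a symmetric scheme the reconstruction error per weight is at most about half a quantisation step (half the group's scale), which is why small groups with their own scales track outliers better than a single per-tensor scale.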
Quantisation quality (averaged across all 5 conversions):
| Metric | Value |
|---|---|
| MSE (mean) | 3.04 × 10⁻⁶ |
| SNR (mean) | 44.7 dB |
| Cosine similarity (mean) | 0.999936 |
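For readers who want to reproduce this kind of measurement, the sketch below computes the three metrics using their standard definitions on an original tensor and its dequantised counterpart, both flattened to lists. Whether the Cactus converter uses exactly these definitions (for example, signal power over noise power for SNR) is an assumption, and `quant_metrics` is a hypothetical name.

```python
import math


def quant_metrics(original, restored):
    """Return (MSE, SNR in dB, cosine similarity) between two equal-length lists."""
    if not original or len(original) != len(restored):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(original)

    # Mean squared error between original and dequantised values.
    mse = sum((a - b) ** 2 for a, b in zip(original, restored)) / n

    # SNR: mean signal power over mean error power, in decibels.
    signal_power = sum(a * a for a in original) / n
    snr_db = 10.0 * math.log10(signal_power / mse) if mse > 0 else math.inf

    # Cosine similarity; assumes neither vector is all zeros.
    dot = sum(a * b for a, b in zip(original, restored))
    norm_a = math.sqrt(sum(a * a for a in original))
    norm_b = math.sqrt(sum(b * b for b in restored))
    cosine = dot / (norm_a * norm_b)

    return mse, snr_db, cosine
```

Averaging these values over all tensors of a model, and then over the five conversions, would give figures of the kind reported in the table.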
Why INT8 rather than INT4: on-device validation revealed that the vision_v2 and
video_v6 multimodal projection layers degraded reproducibly under INT4
weight quantisation (truncated JSON output, label-vs-intent_class
inconsistency on held-out scam imagery). INT8 uses the same SMMLA
hardware kernel path on Snapdragon 8 Gen 3, so it decodes at the same
speed as INT4, and it restores near-lossless accuracy at ~2× the on-disk
size. The full migration narrative is in
docs/submission/cactus_models_manifest.md
in the code repo.
On-device performance (Honor Magic V3, Snapdragon 8 Gen 3)
- ~70 prefill tokens/sec
- ~5 decode tokens/sec (multimodal inputs are heavier than pure text)
- ~2 s TTFT
- ~3 GB resident per loaded model (28 GB on-disk for all 5; one model mmap'd at a time)
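These throughput figures allow a rough back-of-envelope latency estimate. The sketch below assumes a simple two-phase model (prompt tokens processed at the prefill rate, output tokens generated at the decode rate) and ignores model load time and any fixed overhead; the function name is hypothetical and the model itself is a simplification, not a measured result.

```python
def estimate_latency_s(prompt_tokens, output_tokens, prefill_tps=70.0, decode_tps=5.0):
    """Rough end-to-end latency in seconds under a two-phase throughput model."""
    prefill_s = prompt_tokens / prefill_tps  # time before the first output token
    decode_s = output_tokens / decode_tps    # time spent generating the output
    return prefill_s + decode_s


# Example: a 140-token prompt with a 50-token JSON verdict.
print(estimate_latency_s(140, 50))
```

Under this model a 140-token prompt takes about 2 s of prefill, and a 50-token response adds about 10 s of decoding, so output length dominates the total.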
Training
All five adapters were fine-tuned via Unsloth FastModel on Kaggle T4 × 2
from unsloth/gemma-4-E2B-it. Held-out evaluation deltas (zero-shot
base → fine-tuned, 95 % bootstrap CI):
| Modality | n_eval | Label / tier ZS → FT | Δ |
|---|---|---|---|
| text_triage_v2 | 563 | F1 0.520 → 0.757 | +0.237 ✓ |
| vision_v2 | 270 | F1 0.442 → 0.976 | +0.535 ✓ |
| audio_v2 | 140 | F1 0.896 → 0.951 | +0.055 ✓ |
| reasoner_v4 | 186 | tier acc 0.581 → 0.930 | +0.349 ✓ |
| video_v6 | 61 | F1 0.000 → 0.81 | +0.81 |
13/13 metrics statistically significant at 95 % bootstrap CI. Full
scorecards + per-class breakdowns under
training/reports/v2/
in the code repo.
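For context on how a bootstrap confidence interval for such a delta can be obtained, here is a minimal sketch of a paired percentile bootstrap over evaluation examples for a binary F1 score. The evaluation code in the repo is not reproduced here, so the resampling scheme, the number of resamples, and all function names are assumptions.

```python
import random


def f1_score(labels, preds, positive=1):
    """Binary F1 for the positive class; returns 0.0 when undefined."""
    tp = sum(1 for y, p in zip(labels, preds) if y == positive and p == positive)
    fp = sum(1 for y, p in zip(labels, preds) if y != positive and p == positive)
    fn = sum(1 for y, p in zip(labels, preds) if y == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_delta_ci(labels, zs_preds, ft_preds, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile CI for F1(fine-tuned) - F1(zero-shot), resampling examples in pairs."""
    rng = random.Random(seed)
    n = len(labels)
    deltas = []
    for _ in range(n_resamples):
        # Draw example indices with replacement; both models share each resample.
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        ft = f1_score(y, [ft_preds[i] for i in idx])
        zs = f1_score(y, [zs_preds[i] for i in idx])
        deltas.append(ft - zs)
    deltas.sort()
    lower = deltas[int((alpha / 2) * n_resamples)]
    upper = deltas[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

Under this scheme a delta counts as significant at the 95 % level when the returned interval excludes zero.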
Usage on-device
The Keelhaul Android app downloads tarballs on first use of each
modality and caches them at
/data/data/com.keelhaul.detector/files/cactus_models/<id>/. From there
cactus_init(dir.path, null, false) loads them into the Cactus runtime;
the app's CactusPipeline.completeRaw hot-swaps modalities on demand.
Limitations
- audio_v2 produced one observed false positive on a held-out benign LibriSpeech clip during on-device validation.
- video_v6 exhibited a pair inversion on the specific real-vs-fake samples used during demo capture; the binary deepfake signal is correct in aggregate (held-out F1 0.81) but is the noisiest of the five adapters.
Licence
Apache-2.0. Base model
unsloth/gemma-4-E2B-it
is subject to Google's Gemma terms; please review before redistribution.
Citation
Keelhaul: on-device multimodal scam detection on Gemma 4 E2B
Brian Isaac, 2026
Gemma 4 Good Hackathon submission