Keelhaul: Cactus INT8 Adapter Weights

Five fine-tuned multimodal LoRA adapters for on-device scam detection on Gemma 4 E2B, packaged for the Cactus on-device inference runtime.

This is the model-asset companion to the Keelhaul Android app, a submission for the Gemma 4 Good Hackathon (Safety & Trust + Unsloth + LiteRT + Cactus tracks, deadline 2026-05-18).

Code: https://github.com/BrianIsaac/keelhaul

What is shipped

| Tarball | Modality | Role | Size (INT8) |
|---|---|---|---|
| keelhaul_text_triage_v2.tar.gz | text | Single-message scam / benign classification with phase + technique tagging | ~4.7 GB compressed (5.6 GB extracted) |
| keelhaul_text_reasoner_v4.tar.gz | text | Trajectory-window reasoning: tier (safe / watching / concerned / alert) + phase + technique stack + grounded evidence: [{turn, role}] + 2-4-sentence reasoning | ~4.7 GB compressed |
| keelhaul_vision_v2.tar.gz | vision | Scam-imagery classification: label + intent_class (e.g. fake_document, stolen_persona, crypto_scam_visual) + synthesis cues + region hint | ~4.7 GB compressed |
| keelhaul_audio_v2.tar.gz | audio | Voice-scam classification: label + intent_class (e.g. authority_impersonation, account_takeover_demand) + vocal cues + weakest segment | ~4.7 GB compressed |
| keelhaul_video_v6.tar.gz | video | Deepfake / synthetic-video detection on 4-keyframe + face-crop input; feeds into a runtime video composer alongside vision + audio | ~4.7 GB compressed |
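
The reasoner row above describes a structured output (tier, evidence entries with turn and role, short reasoning). The sketch below shows one way a caller might sanity-check such an output before acting on it. It assumes the model emits a JSON object and that the keys are literally named `tier` and `evidence`; those key names and the function `validate_reasoner_output` are hypothetical and not taken from the Keelhaul code.

```python
import json

# The four tier values listed in the table above.
TIERS = {"safe", "watching", "concerned", "alert"}


def validate_reasoner_output(raw):
    """Return a list of problems; an empty list means the output looks well-formed.

    Assumption: the output is a JSON object with "tier" and "evidence" keys.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Truncated generations end up here.
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    tier = data.get("tier")
    if not isinstance(tier, str) or tier not in TIERS:
        problems.append("tier missing or not one of the four tiers")

    evidence = data.get("evidence")
    if not isinstance(evidence, list):
        problems.append("evidence is not a list")
    else:
        for i, item in enumerate(evidence):
            if not (isinstance(item, dict) and "turn" in item and "role" in item):
                problems.append(f"evidence[{i}] lacks turn or role")
    return problems
```

A check of this kind would also catch truncated JSON, since `json.loads` rejects incomplete objects.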

Each tarball expands to a Cactus model directory: 1959 .weights files + config.txt + tokenizer.json + tokenizer_config.json.
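
As a quick integrity check after extraction, the layout described above can be verified with a few lines of Python. This is a minimal sketch: it assumes all `.weights` files sit at the top level of the extracted directory, which this card does not state, and `check_model_dir` is a hypothetical helper name.

```python
from pathlib import Path

# Side files named in this card; the .weights count is the figure quoted above.
REQUIRED_FILES = ("config.txt", "tokenizer.json", "tokenizer_config.json")
EXPECTED_WEIGHT_FILES = 1959


def check_model_dir(model_dir, expected_weights=EXPECTED_WEIGHT_FILES):
    """Return a list of problems found in an extracted model directory."""
    root = Path(model_dir)
    problems = []
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing {name}")
    # Assumption: .weights files are not nested in subdirectories.
    n_weights = sum(1 for p in root.glob("*.weights") if p.is_file())
    if n_weights != expected_weights:
        problems.append(
            f"expected {expected_weights} .weights files, found {n_weights}"
        )
    return problems
```

An empty return value means the directory matches the description; anything else suggests an incomplete download or extraction.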

Quantisation

INT8 via Cactus's CLI converter. Per-group symmetric quantisation, group size 32, FP16 scales. The multimodal projection layers and the synthetic post-projection norm are kept at FP16 by design (1674 FP16 tensors, 278 INT8-quantised tensors per model).
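
To make the scheme concrete, here is a minimal pure-Python sketch of per-group symmetric INT8 quantisation with group size 32 and FP16-rounded scales. It illustrates the general technique only; it is not the Cactus converter's code, and details such as the clamp range of [-127, 127] and the handling of all-zero groups are assumptions.

```python
import struct

GROUP_SIZE = 32  # group size quoted above


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantise_group(values):
    """Symmetric INT8: one scale per group, zero-point fixed at 0."""
    max_abs = max(abs(v) for v in values)
    scale = to_fp16(max_abs / 127.0) if max_abs > 0 else 1.0
    if scale == 0.0:
        # The scale underflowed in FP16; every code rounds to 0 anyway.
        scale = 1.0
    # Clamp because the FP16-rounded scale can be slightly too small.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def quantise(weights, group_size=GROUP_SIZE):
    """Split a flat weight list into groups and quantise each one."""
    return [
        quantise_group(weights[start:start + group_size])
        for start in range(0, len(weights), group_size)
    ]


def dequantise(groups):
    """Reconstruct approximate float weights from (codes, scale) pairs."""
    out = []
    for codes, scale in groups:
        out.extend(c * scale for c in codes)
    return out
```

Because each group of 32 values gets its own scale, a single outlier only coarsens the resolution of its own group rather than of the whole tensor.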

Quantisation quality (averaged across all 5 conversions):

| Metric | Value |
|---|---|
| MSE (mean) | 3.04 × 10⁻⁶ |
| SNR (mean) | 44.7 dB |
| Cosine similarity (mean) | 0.999936 |
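
For readers who want to reproduce figures of this kind on their own tensors, the sketch below computes the three metrics using their standard textbook definitions. The card does not say exactly how the converter computes or averages them, so treat these formulas as an assumption; `quality_metrics` is a hypothetical helper name.

```python
import math


def quality_metrics(original, restored):
    """Compare original weights against their dequantised counterparts."""
    n = len(original)
    noise = sum((o - r) ** 2 for o, r in zip(original, restored))
    signal = sum(o * o for o in original)

    mse = noise / n
    # SNR in decibels: signal power over quantisation-noise power.
    snr_db = 10.0 * math.log10(signal / noise) if noise > 0 else math.inf

    dot = sum(o * r for o, r in zip(original, restored))
    norm = math.sqrt(signal) * math.sqrt(sum(r * r for r in restored))
    cosine = dot / norm if norm > 0 else 0.0
    return {"mse": mse, "snr_db": snr_db, "cosine": cosine}
```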

Why INT8 not INT4: on-device validation revealed that the vision_v2 and video_v6 multimodal projection layers degraded reproducibly under INT4 weight quantisation (truncated JSON output, label-vs-intent_class inconsistency on held-out scam imagery). INT8 uses the same SMMLA hardware kernel path on Snapdragon 8 Gen 3, so decode speed matches INT4, and it restores near-lossless accuracy at ~2× the on-disk size. The full migration narrative is in docs/submission/cactus_models_manifest.md in the code repo.

On-device performance (Honor Magic V3, Snapdragon 8 Gen 3)

  • ~70 prefill tokens/sec
  • ~5 decode tokens/sec (multimodal inputs are heavier than pure text)
  • ~2 s TTFT
  • ~3 GB resident per loaded model (28 GB on-disk for all 5; one model mmap'd at a time)
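
The figures above support a rough back-of-envelope estimate of end-to-end latency and storage. The sketch below assumes a simple linear model (prefill time plus decode time, ignoring model load and hot-swap cost) and assumes every model extracts to the 5.6 GB quoted for text_triage_v2; neither assumption is confirmed by this card.

```python
# Throughput figures quoted above (tokens per second).
PREFILL_TOK_PER_S = 70.0
DECODE_TOK_PER_S = 5.0


def estimate_latency_s(prompt_tokens, output_tokens):
    """Return (time to first token, total time) in seconds.

    Assumption: latency is purely prefill plus decode; load time is ignored.
    """
    prefill = prompt_tokens / PREFILL_TOK_PER_S
    decode = output_tokens / DECODE_TOK_PER_S
    return prefill, prefill + decode


# Assumption: all five models extract to the same size as text_triage_v2.
EXTRACTED_GB_PER_MODEL = 5.6
total_disk_gb = round(5 * EXTRACTED_GB_PER_MODEL, 1)
```

Under this model a 140-token prompt takes 2 s to prefill, in line with the ~2 s TTFT above, and a 100-token response adds about 20 s of decode. Five models at 5.6 GB each come to the 28 GB on-disk figure.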

Training

All five adapters were fine-tuned via Unsloth FastModel on Kaggle T4 × 2 from unsloth/gemma-4-E2B-it. Held-out evaluation deltas (zero-shot base → fine-tuned, 95 % bootstrap CI):

| Modality | n_eval | Label / tier | ZS → FT | Δ |
|---|---|---|---|---|
| text_triage_v2 | 563 | F1 | 0.520 → 0.757 | +0.237 ★ |
| vision_v2 | 270 | F1 | 0.442 → 0.976 | +0.535 ★ |
| audio_v2 | 140 | F1 | 0.896 → 0.951 | +0.055 ★ |
| reasoner_v4 | 186 | tier acc | 0.581 → 0.930 | +0.349 ★ |
| video_v6 | 61 | F1 | 0.000 → 0.81 | +0.81 |

13/13 metrics statistically significant at 95 % bootstrap CI. Full scorecards + per-class breakdowns under training/reports/v2/ in the code repo.
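
For context on what a bootstrap significance check of this kind involves, here is a minimal sketch of a paired percentile bootstrap on an accuracy delta. It is an illustration, not the evaluation code behind the scorecards: the resample count, the percentile method, and the use of accuracy rather than F1 are all assumptions, and `bootstrap_delta_ci` is a hypothetical name.

```python
import random


def bootstrap_delta_ci(zs_correct, ft_correct, n_resamples=2000, seed=0):
    """95% percentile CI for (fine-tuned accuracy - zero-shot accuracy).

    Both inputs are per-example 0/1 correctness lists over the same eval set,
    so each resample draws the same indices for both models (paired bootstrap).
    """
    if len(zs_correct) != len(ft_correct):
        raise ValueError("both models must be scored on the same examples")
    rng = random.Random(seed)
    n = len(zs_correct)
    deltas = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        zs_acc = sum(zs_correct[i] for i in idx) / n
        ft_acc = sum(ft_correct[i] for i in idx) / n
        deltas.append(ft_acc - zs_acc)
    deltas.sort()
    lo = deltas[int(0.025 * n_resamples)]
    hi = deltas[int(0.975 * n_resamples) - 1]
    return lo, hi
```

Under this scheme an improvement would be called significant at the 95 % level when the lower bound of the interval is above zero.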

Usage on-device

The Keelhaul Android app downloads tarballs on first use of each modality and caches them at /data/data/com.keelhaul.detector/files/cactus_models/<id>/. From there cactus_init(dir.path, null, false) loads them into the Cactus runtime; the app's CactusPipeline.completeRaw hot-swaps modalities on demand.

Limitations

  • audio_v2 produced one observed false positive on a held-out benign LibriSpeech clip during on-device validation.
  • video_v6 exhibited a pair inversion on the specific real-vs-fake samples used during demo capture; the binary deepfake signal is correct in aggregate (held-out F1 0.81) but is the noisiest of the five adapters.

Licence

Apache-2.0. Base model unsloth/gemma-4-E2B-it is subject to Google's Gemma terms; please review them before redistribution.

Citation

Keelhaul: on-device multimodal scam detection on Gemma 4 E2B
Brian Isaac, 2026
Gemma 4 Good Hackathon submission