● Public beta · v1

Qur'anic Arabic speech recognition, by the engineers who understand recitation.

The Hafiz API gives developers a single endpoint to transcribe any audio of Qur'anic recitation. Word-level timestamps, confidence scores, and reciter-quality accuracy. Built on a Quran-tuned Whisper, the same engine powering our live demo.

Live. Sign-up, key issuance, Stripe billing and the customer portal are all live at auth.hafiz-ai.com. Free tier ships 60 minutes of audio per month — no card required. Try the engine first on the live demo page, then sign up for a key when you're ready to integrate.

Hafiz API

A single REST endpoint that turns Qur'anic recitation audio into text with word-level timing and confidence. No model hosting, no GPU, no training. Just POST a wav file.

What you'll get

  • Speaker-independent accuracy on Qur'anic Arabic — built on tarteel-ai/whisper-base-ar-quran, verified to produce identical transcripts across professional reciters (Sudais, Husary) and unseen voices in our internal evals.
  • Word-level timestamps so you can build live highlight, scrubbing, or alignment-based grading.
  • Diacritic restoration — pass the target ayah and we return fully vowelised (Mushaf-style) Arabic.
  • Sub-second latency for short ayahs on our GPU cloud (85–270 ms per ayah on a consumer GPU; cloud will be faster).
  • Pay only for what you transcribe. The free tier covers 60 minutes/month, enough for hobby projects forever.

Live now. Every endpoint below is implemented and accepts real requests as soon as you create an account. Stripe billing, the customer portal, and webhook delivery are all wired up. Report anything off to hello@hafiz-ai.com.

Quickstart — first transcription in 60 seconds

Send a wav/mp3/m4a file to /v1/transcribe with your API key. Response comes back as JSON.

curl

curl https://api.hafiz-ai.com/v1/transcribe \
  -H "Authorization: Bearer hafiz_sk_live_..." \
  -F "file=@recitation.wav" \
  -F "language=ar" \
  -F "return_timestamps=true"

Python

import requests

# Open the file in a context manager so the handle is closed after the upload.
with open("recitation.wav", "rb") as audio:
    resp = requests.post(
        "https://api.hafiz-ai.com/v1/transcribe",
        headers={"Authorization": "Bearer hafiz_sk_live_..."},
        files={"file": audio},
        # Form values are sent as strings; "true" matches the curl example.
        data={"language": "ar", "return_timestamps": "true"},
        timeout=60,
    )
resp.raise_for_status()
print(resp.json())

Node.js

import fs from "node:fs";

// Needs Node 18+ for the global fetch, FormData and Blob.
const form = new FormData();
// Pass a filename so the part is not uploaded under the default name "blob".
form.append("file", new Blob([fs.readFileSync("recitation.wav")]), "recitation.wav");
form.append("language", "ar");
form.append("return_timestamps", "true");

const resp = await fetch("https://api.hafiz-ai.com/v1/transcribe", {
  method: "POST",
  headers: { Authorization: "Bearer hafiz_sk_live_..." },
  body: form,
});
console.log(await resp.json());

Swift

// makeMultipart(boundary:fileURL:) and TranscriptionResponse are your own
// helper and Decodable type; they are not shown here.
var req = URLRequest(url: URL(string: "https://api.hafiz-ai.com/v1/transcribe")!)
req.httpMethod = "POST"
req.setValue("Bearer hafiz_sk_live_...", forHTTPHeaderField: "Authorization")

let boundary = "hafiz-\(UUID().uuidString)"
req.setValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")
req.httpBody = makeMultipart(boundary: boundary, fileURL: recordingURL)

let (data, _) = try await URLSession.shared.data(for: req)
let result = try JSONDecoder().decode(TranscriptionResponse.self, from: data)

You'll get back something like:

{
  "text": "بسم الله الرحمن الرحيم",
  "language": "ar",
  "audio_seconds": 3.84,
  "inference_seconds": 0.71,
  "model": "whisper-quran-everyayah-5reciters-v1",
  "words": [
    { "word": "بسم", "start": 0.12, "end": 0.61, "confidence": 0.97 },
    { "word": "الله", "start": 0.62, "end": 1.18, "confidence": 0.99 },
    { "word": "الرحمن", "start": 1.21, "end": 2.02, "confidence": 0.98 },
    { "word": "الرحيم", "start": 2.05, "end": 3.79, "confidence": 0.96 }
  ]
}
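
The word-level timing in this response is enough to drive a live highlight. Below is a minimal Python sketch, assuming the words array is sorted by start time as in the sample. The helper name word_at is hypothetical and not part of the API.

```python
from bisect import bisect_right

# Sample "words" array copied from the response above.
words = [
    {"word": "بسم", "start": 0.12, "end": 0.61, "confidence": 0.97},
    {"word": "الله", "start": 0.62, "end": 1.18, "confidence": 0.99},
    {"word": "الرحمن", "start": 1.21, "end": 2.02, "confidence": 0.98},
    {"word": "الرحيم", "start": 2.05, "end": 3.79, "confidence": 0.96},
]


def word_at(words, t):
    """Return the word dict whose [start, end] span contains time t, else None.

    Assumes `words` is sorted by "start" (an assumption based on the sample).
    """
    starts = [w["start"] for w in words]
    i = bisect_right(starts, t) - 1  # last word that starts at or before t
    if i >= 0 and t <= words[i]["end"]:
        return words[i]
    return None  # t falls before the first word or in a gap between words
```

Call word_at(words, player_position_seconds) on each playback tick and highlight the returned word. A None result means the position falls in a pause between words.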

Authentication

Every request must include an Authorization header with your secret key:

Authorization: Bearer hafiz_sk_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Keys come in two flavours:

  • hafiz_sk_test_…: sandbox, free, no charges, deterministic mocked transcripts for CI.
  • hafiz_sk_live_…: production, billed per second of audio.

Treat keys like passwords. Never commit them to a public repo or ship them in client-side iOS / Android / web bundles. Use a server-side proxy or our short-lived JWT flow (contact us).
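
One way to keep the key out of source control is to read it from the environment on your server. This is a small sketch, not an official client. The variable name HAFIZ_KEY mirrors the shell variable used in the curl example further down, and auth_headers is a hypothetical helper.

```python
import os


def auth_headers(env=os.environ):
    """Build the Authorization header from the HAFIZ_KEY environment variable.

    HAFIZ_KEY is an assumed variable name; use whatever your deployment defines.
    """
    key = env.get("HAFIZ_KEY", "")
    # Both documented key prefixes are accepted; anything else is rejected early.
    if not key.startswith(("hafiz_sk_test_", "hafiz_sk_live_")):
        raise RuntimeError("HAFIZ_KEY is missing or does not look like a Hafiz secret key")
    return {"Authorization": f"Bearer {key}"}
```

Failing fast here gives a clearer error than a 401 from the API when a deployment forgets to set the variable.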

Transcribe an audio file

Full transcription of a recording. Best for whole-ayah submissions or post-recitation analysis.

POST https://api.hafiz-ai.com/v1/transcribe

Multipart body

Field | Type | Description
file (required) | file | Audio file. Supported: wav, mp3, m4a, flac, ogg. Max 25 MB / 5 minutes.
language (optional) | string | Defaults to ar. Reserved for future multilingual variants.
return_timestamps (optional) | bool | Defaults to true. Include word-level start/end timing.
target_ayah (optional) | string | The target Mushaf ayah text. If supplied, the response includes a diacritized_text field with restored ḥarakāt aligned to your target.
max_new_tokens (optional) | integer | Defaults to 225. Range 32–448.
webhook_url (optional) | string | If set, the API returns 202 Accepted immediately and POSTs the result to this URL when ready. Recommended for files >30s.

Example

curl -X POST https://api.hafiz-ai.com/v1/transcribe \
  -H "Authorization: Bearer $HAFIZ_KEY" \
  -F "file=@al-fatiha-1.wav" \
  -F "target_ayah=بِسْمِ ٱللَّهِ ٱلرَّحْمَٰنِ ٱلرَّحِيمِ"
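
Checking the documented limits locally avoids a round trip that would end in a 400. The sketch below builds the non-file form fields for a request. build_form is a hypothetical helper, and reading "25 MB" as 25 × 1000 × 1000 bytes is an assumption, since the field table does not say whether MB or MiB is meant.

```python
import os

# Limits taken from the field table; the byte interpretation is an assumption.
MAX_BYTES = 25 * 1000 * 1000
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}


def build_form(path, size_bytes, target_ayah=None, max_new_tokens=225, webhook_url=None):
    """Validate inputs locally and return the non-file multipart fields."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext or 'none'}")
    if size_bytes > MAX_BYTES:
        raise ValueError("file exceeds the 25 MB limit")
    if not 32 <= max_new_tokens <= 448:
        raise ValueError("max_new_tokens must be between 32 and 448")

    # Multipart form values travel as strings, so convert explicitly.
    form = {
        "language": "ar",
        "return_timestamps": "true",
        "max_new_tokens": str(max_new_tokens),
    }
    if target_ayah:
        form["target_ayah"] = target_ayah
    if webhook_url:
        form["webhook_url"] = webhook_url
    return form
```

Pass the returned dict as the form data of your HTTP client alongside the file part. The 5-minute duration limit is not checked here because it requires decoding the audio.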

Live partial transcription

Optimised for low-latency progressive transcription. Send 2–6 second chunks of audio every few seconds and get back what's been recited so far. This is the path Iqra's "Hafiz Live" mode will use.

POST https://api.hafiz-ai.com/v1/transcribe_partial

Same shape as /v1/transcribe but uses a smaller token budget (default 120) and skips post-processing for sub-300ms turnaround on most M-series Macs.

Tip: Send a rolling snapshot of your local recording every 2.5 seconds, and replace the displayed transcript each time. Don't try to stitch chunks — the model handles that.
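
The tip above amounts to a simple polling loop. Here is a minimal sketch with the audio capture and the HTTP call injected as functions, so it stays independent of any recording library. All names (run_live, read_snapshot, post_partial, on_transcript) are hypothetical, and it assumes the partial response carries a text field like the /v1/transcribe response.

```python
import time


def run_live(read_snapshot, post_partial, on_transcript, interval=2.5, sleep=time.sleep):
    """Send a snapshot every `interval` seconds and replace the shown transcript.

    read_snapshot() -> bytes of audio to send, or None once recording has stopped.
    post_partial(audio) -> parsed JSON dict from POST /v1/transcribe_partial.
    on_transcript(text) -> called with the full transcript so far.
    """
    while True:
        audio = read_snapshot()
        if audio is None:  # recording finished
            break
        result = post_partial(audio)
        # Replace what is displayed; do not append to the previous transcript.
        on_transcript(result["text"])
        sleep(interval)
```

In an app, on_transcript would overwrite a label or text view. Injecting sleep makes the loop easy to unit-test without waiting.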

Service health

GET https://api.hafiz-ai.com/v1/health

Returns { "status": "ok", "ready": true, "backend": "google-cloud-speech-v1", "model": "...", "language": "ar-SA" }. No auth required. Suitable for uptime monitors.
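
For an uptime monitor, a check along these lines may be enough. It is a sketch that treats the service as up only when both status and ready look healthy; is_ready is a hypothetical helper name.

```python
def is_ready(payload):
    """True only when the health payload reports status "ok" and ready true."""
    return payload.get("status") == "ok" and payload.get("ready") is True


# Example use against the live endpoint (needs network access):
# import json, urllib.request
# with urllib.request.urlopen("https://api.hafiz-ai.com/v1/health", timeout=5) as r:
#     print(is_ready(json.load(r)))
```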

Dashboard & billing endpoints

These are the endpoints the developer dashboard at auth.hafiz-ai.com/dashboard.html uses. They require a Firebase ID token in the Authorization header (issued automatically on sign-in) rather than an API key. They aren't meant for direct programmatic use, but we document them here in case you want to wire them into a custom dashboard.

Method | Path | Purpose
POST | /v1/bootstrap | Idempotent: create / update your user record on first sign-in.
GET | /v1/me | Returns your plan, usage this period, and email.
POST | /v1/keys | Mint a new API key (form field livemode=true|false). Plaintext returned once.
GET | /v1/keys | List the prefixes & metadata of all your keys.
DELETE | /v1/keys/{id} | Revoke a key by its hash id.
POST | /v1/billing/checkout | Start a Stripe Checkout session. Body: plan, success_url, cancel_url.
POST | /v1/billing/portal | Open the Stripe customer portal. Body: return_url.
POST | /v1/billing/webhook | Stripe → Hafiz webhook receiver. Verifies Stripe-Signature.

Roadmap: A standalone /v1/align endpoint (per-word verdicts: correct, near_match, substituted, missed, extra) is on the Q3 2026 plan. Today this logic ships inside the iOS app. Until the endpoint lands, pass target_ayah to /v1/transcribe and use the per-word confidence values for grading server-side.
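
Until /v1/align exists, a rough server-side pass can simply flag low-confidence words for review. The sketch below is a stand-in, not the grading logic used in the iOS app. words_to_review and the threshold are illustrative assumptions, and a low confidence value is a hint to re-check a word rather than proof of a mistake.

```python
def words_to_review(words, threshold=0.90):
    """Return the words whose confidence falls below `threshold`.

    The default threshold is an arbitrary illustration; tune it on your own data.
    """
    return [w["word"] for w in words if w["confidence"] < threshold]


# "words" array from the sample response earlier on this page.
sample = [
    {"word": "بسم", "start": 0.12, "end": 0.61, "confidence": 0.97},
    {"word": "الله", "start": 0.62, "end": 1.18, "confidence": 0.99},
    {"word": "الرحمن", "start": 1.21, "end": 2.02, "confidence": 0.98},
    {"word": "الرحيم", "start": 2.05, "end": 3.79, "confidence": 0.96},
]
flagged = words_to_review(sample, threshold=0.97)
```

With the sample data and a threshold of 0.97, only the last word (confidence 0.96) is flagged.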

Response shape

Field | Type | Description
text | string | Plain Arabic transcript (no diacritics by default).
diacritized_text | string? | Present only if target_ayah was supplied. Returns Mushaf-style fully-vowelised Arabic.
words | array | Each: { word, start, end, confidence }. Times in seconds.
language | string | Detected/forced language. Always "ar" in v1.
audio_seconds | number | Length of the input audio. This is what gets billed.
inference_seconds | number | Wall-clock time we spent on the GPU.
model | string | Versioned model identifier; pin against this in production.
request_id | string | Surface this in support tickets.

Errors

All errors return JSON of shape { "error": { "code": "...", "message": "..." } }.

HTTP | Code | Meaning
400 | invalid_audio | File could not be decoded, or is > 25 MB / 5 min.
401 | missing_key | No Authorization header.
401 | invalid_key | Key is malformed or revoked.
402 | quota_exceeded | Free tier consumed for the month, or paid card declined.
429 | rate_limited | Too many concurrent requests. Honor Retry-After.
500 | internal | Our side. Auto-reported. Safe to retry once.
503 | queue_full | Spike capacity hit. Retry with exponential backoff.
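
The retry guidance in the table can be folded into one small policy function. This is a sketch under stated assumptions: Retry-After is read as a number of seconds (HTTP also allows a date form, which is not handled here), and the base delay and cap are arbitrary illustrative values. retry_delay is a hypothetical helper.

```python
def retry_delay(status, headers, attempt, base=0.5, cap=30.0):
    """Return seconds to wait before retrying, or None if the call should not be retried.

    `attempt` is 0 for the first retry, 1 for the second, and so on.
    """
    backoff = min(cap, base * 2 ** attempt)  # exponential backoff with a ceiling
    if status == 429:
        # Honor Retry-After when it is a plain number of seconds.
        try:
            return float(headers.get("Retry-After"))
        except (TypeError, ValueError):
            return backoff
    if status == 503:
        return backoff
    if status == 500:
        return base if attempt == 0 else None  # safe to retry once only
    return None  # 400, 401, 402: retrying will not help
```

Adding random jitter to the backoff is a common refinement when many clients may retry at once. It is left out here to keep the function deterministic.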

Rate limits

Limits are per-key and reset every 60 seconds. Each response includes:

  • X-RateLimit-Limit — the cap.
  • X-RateLimit-Remaining — calls left in the window.
  • X-RateLimit-Reset — Unix epoch when the window resets.

See the pricing table for per-tier limits. Need more? Email hello@hafiz-ai.com.
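
A client can use these headers to pause before it hits a 429. A minimal sketch, assuming the header values are plain integers as described above; seconds_until_reset is a hypothetical helper.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next call, based on the rate-limit headers.

    Returns 0.0 while calls remain in the current window.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers["X-RateLimit-Reset"])  # Unix epoch seconds
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)
```

Pass the headers of the most recent response and sleep for the returned number of seconds before sending the next request.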

Webhooks

For long-running batch transcriptions (set webhook_url in your request), we POST the result back as JSON with the same shape as a sync response, plus an X-Hafiz-Signature HMAC-SHA256 header you should verify with your webhook secret.

# Verifying a webhook in Python
import hmac, hashlib

def verify(body: bytes, sig: str, secret: str) -> bool:
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

Pricing

Pay only for the seconds of audio you transcribe. No minimums, no per-request fees, no surprise overage charges — we hard-stop at your monthly cap.

Free

Hobbyist

$0/month
  • 60 minutes / month of audio
  • 20 requests / minute
  • Word-level timestamps
  • Diacritic restoration
  • Sandbox keys included
  • Community support (GitHub)

Scale

Studio

$249/month
  • 250 hours / month included
  • Then $0.025 per audio-minute
  • 600 requests / minute
  • Dedicated model instance
  • Custom fine-tunes (quote)
  • Priority support · 12h SLA

Need an enterprise / on-prem deployment? We can ship the model as a Docker image to run inside your VPC or air-gapped network. Email hello@hafiz-ai.com.
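
To estimate a Studio bill from the figures listed above ($249 base, 250 hours included, then $0.025 per audio-minute), the arithmetic can be sketched as follows. studio_monthly_cost is a hypothetical helper, and the result ignores taxes or any custom terms.

```python
def studio_monthly_cost(audio_hours):
    """Estimate the monthly Studio charge in USD for a given number of audio hours."""
    base_usd = 249.00
    included_hours = 250
    overage_usd_per_minute = 0.025
    extra_minutes = max(0.0, audio_hours - included_hours) * 60
    return round(base_usd + extra_minutes * overage_usd_per_minute, 2)
```

For example, 300 hours in a month is 50 hours over the allowance: 3,000 minutes × $0.025 = $75, for a total of $324.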

Get your API key

Sign up at auth.hafiz-ai.com with Google, Apple, or email. You'll land on the dashboard with a test key already minted — paste it into the curl example above to verify everything works end-to-end. Live keys (billable) need a card on file, which you can add from the same dashboard.

The live /v1 surface today:

  • POST /v1/transcribe — billed, rate-limited, requires an API key.
  • POST /v1/transcribe_partial — same auth + billing, low-latency partial decoding for live UIs.
  • GET /v1/health — public, no auth, for uptime monitors.
  • Dashboard endpoints (/v1/keys, /v1/me, /v1/billing/*) — Firebase ID token, used by auth.hafiz-ai.com.

The surface continues to expand — subscribe to the release notes or email hello@hafiz-ai.com for updates.

If you're evaluating for high volume (> 250 hrs / month) and want a custom contract, email us and we'll set you up directly.

FAQ

Does the API support tajweed scoring?

Not yet — but pass target_ayah to /v1/transcribe and you get back per-word confidence values plus a diacritic-restored transcript. That's what Iqra's tajweed-aware grading is built on today. A standalone /v1/align endpoint and a dedicated tajweed scorer are both on the Q3 2026 roadmap.

Can I use this for non-Qur'anic Arabic?

Technically yes — the underlying model is a Whisper variant — but accuracy will be much worse than for Qur'anic text. Use OpenAI / Deepgram / Azure for general Arabic. Use us when the audio is recitation.

What's the maximum file size?

25 MB or 5 minutes per request, whichever comes first. For longer audio, slice it client-side or contact us about the batch endpoint.
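
For WAV input, client-side slicing can be done with Python's standard wave module. The sketch below cuts at fixed time offsets, which is an assumption made for simplicity. In practice you would probably prefer to cut at pauses or ayah boundaries so that no word is split. split_wav is a hypothetical helper.

```python
import io
import wave


def split_wav(data, max_seconds=300.0):
    """Split WAV bytes into a list of WAV byte strings of at most `max_seconds` each."""
    chunks = []
    with wave.open(io.BytesIO(data), "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(max_seconds * src.getframerate())
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                # Copy channels, sample width and rate; the frame count in the
                # header is corrected automatically when the writer closes.
                dst.setparams(params)
                dst.writeframes(frames)
            chunks.append(buf.getvalue())
    return chunks
```

Each returned chunk is a complete WAV file that can be posted on its own. Note that high-rate stereo audio can exceed 25 MB well before 5 minutes, so check the byte size of each chunk as well.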

Do you store my audio?

By default we keep audio for 24h to debug failures, then delete. You can opt out (X-Hafiz-No-Store: 1 header) and we'll process it in-memory only. See the privacy policy.

Is the model open-source?

The base model (tarteel-ai/whisper-base-ar-quran) is. Our fine-tuned weights are not — but we publish training recipes and evals on our research page.

Can I run this on-device?

Yes — the underlying model is standard Whisper, which converts cleanly to CoreML for iOS (via WhisperKit) and ONNX for Android. We'll publish first-class exports alongside the API launch. Until then, the API will give you the same accuracy with zero device cost.

Is the API stable?

The /v1/transcribe endpoint, authentication scheme, and pricing tiers are stable — we won't break them. We may add new fields to responses and new endpoints (batch, streaming, model variants); we won't remove or rename existing ones without a clear migration window.