How it works

The audit, in plain English

This page describes exactly what the engine does. It's kept in sync with the live rules — if you spot a row whose behaviour disagrees with what you read here, that's a bug: click ⚠ Report on the row and we'll fix it.

Engine: v0.14.0 — May 8, 2026

1. The one question

For every line on every carrier bill: "Is this phone number in our Salesforce, and what should we do about it?"

Then the reverse: "For every active Salesforce contract, did it appear on any uploaded bill?"That second pass is the "Unbilled assets" bucket.

2. The nine buckets

Every billed line — and every active Salesforce contract — ends up in exactly one bucket. Listed here in the priority order they appear in the audit results page (most actionable first; healthy & ignored last).

🚨

Ghost devices

high

Hardware appears dead. Cancel + retrieve.

All these must be true

•Billed on the carrier bill
•Phone matches Salesforce active
•Bill shows zero or no usage
•InControl shows device offline 30+ days (configurable in Settings)
•Duplicate SIM ICCID on multiple devices: keep the online device, or the most recently online if all are offline

Action

Cancel the SIM at the carrier and retrieve the hardware.

➕

Add to Salesforce

high

Billed and confirmed working — but missing from CRM.

All these must be true

•Billed on the carrier bill
•Phone is NOT in Salesforce
•InControl confirms the device is ONLINE

Action

Add the Asset to Salesforce. We own it and we know it works.

🔍

Not on Salesforce

medium

Billed line, missing from CRM, no InControl confirmation.

All these must be true

•Billed on the carrier bill
•Phone is NOT in Salesforce
•InControl can't confirm (offline 60+ days OR no device record)

Action

Manual search (tickets / emails / order history) before canceling. If you find a record we own → add to SF; otherwise → cancel.

🪞

Unbilled assets (reverse audit)

medium

Active in Salesforce but NOT on any carrier bill.

All these must be true

•Salesforce status is Active or Deployed
•The phone (or ICCID, for Frontier) doesn't appear on any uploaded carrier bill

Action

Verify line status with the carrier. May be canceled at the carrier without CRM update — risky for failover.

🔄

SIM swaps & failover

medium

MTN/ICCID drift OR healthy failover SIMs.

All these must be true

•(swap) Phone matches SF, but bill ICCID differs from SF ICCID
•(missing-iccid) Phone matches SF, SF has NO ICCID on file → add the bill's ICCID to SF
•(failover) Zero usage but InControl shows device online

Action

Verify, then either update Salesforce SIM ID, add the ICCID to SF, or leave alone (failover is doing its job).

📦

No device attached

medium

Billed line in CRM but no hardware in InControl.

All these must be true

•Billed on the carrier bill
•Phone matches Salesforce active
•Bill shows zero or no usage
•InControl has NO device record for this SIM

Action

Confirm: SIM-only sale, cold spare, or on-site backup SIM? If none of those, treat as ghost candidate.

🤔

Manual audit needed

low

True edge cases — kept as a safety net.

All these must be true

•The rules engine couldn't fit this row into a clearer category
•(in practice currently rare; any future rule branch that can't decide routes here)

Action

Human review of bill + Salesforce + InControl context.

🟢

Working SIMs

high

Healthy. No action needed.

All these must be true

•Phone is in Salesforce active
•Carrier bill reports usage > 0 MB, OR
•Bill has no usage column AND InControl confirms the device is online

Action

Leave it alone.

🗂

Ignored

medium

Operator-approved phones. Hidden from their natural bucket — until they move.

All these must be true

•Operator clicked the Ignore checkbox on a row
•All of the phone's billed lines are still in the bucket the operator approved
•Auto-clears the second any of the phone's lines lands in a DIFFERENT bucket

Action

Nothing. The system has remembered your earlier review.

3. The decision tree

For every billed line the engine walks the forward audit; then for every Salesforce active contract it runs the reverse audit.

Forward audit (per billed line)

Bill says we're paying for phone X (or ICCID, if Frontier)

↓Is phone X in Salesforce active?

↓If NO — what does InControl say about the SIM?

Online → Add to Salesforce (we own it, working)

Offline 60+ days OR no record → Not on Salesforce (search before cancel)

↓If YES — does the bill show real usage (>0 MB)?

↓If YES — does ICCID match?

Match → Working SIMs

SF has different ICCID → SIM swaps & failover (verify swap)

SF has NO ICCID → SIM swaps & failover (add ICCID to SF)

↓If NO usage / unknown — what does InControl say?

Offline 30+ days → Ghost devices

Online (no ICCID drift) → Working SIMs (failover-friendly)

Online + ICCID drift → SIM swaps & failover

No device record → No device attached (SIM-only / cold spare?)

Reverse audit (per Salesforce active contract)

Salesforce says contract Y is active

↓Did Y appear on any uploaded carrier bill?

Yes → covered in the forward audit above

No → Unbilled assets (carrier may have canceled silently)

4. 📞 Phone number is the master key

The audit matches by phone for every carrier (one exception below — Frontier). Phones are normalized so all of these point to the same line:

(555) 123-4567+1 555-123-45671.555.123.45675551234567MTN: 555-123-4567sub: +1 (555) 123-45675551234567 ext 22555.123.4567 x100

Column names in carrier files come in many flavors — all of these are recognized as the phone column:MTN · MDN · MSISDN · sub · sub: · subscriber · PTN · Service_TN · CURR_SRV_ACCS_NBR · Wireless Number · Mobile Number · Phone · Line Number · Contact

5. 🟡 Frontier — the one exception

Frontier doesn't put real phone numbers on their bills. Their Service_TN column actually holds the SIM ICCID (a 19-digit number starting with 891030…).

So for Frontier — and only Frontier — we match by ICCID instead of by phone. Salesforce records for Frontier supplier rows must have the SIM ID populated for matching to work.

6. ⛔ InControl is verification only

The audit list comes from carrier bills. InControl is never a source of audit candidates — only a witness for SIMs already on a bill.

✅ Bill says X is billed → look up X in InControl → use status as evidence
✅ Salesforce says contract Y is active → check whether Y appears on any bill
❌ NEVER: take an InControl SIM and ask "is this in Salesforce?" — that direction is forbidden

7. ICCID verification (only when phone matches)

Once a phone is matched to Salesforce, we check whether the bill's ICCID matches what SF has. There are three states:

match

Bill ICCID is in SF's contract ICCIDs (or bill has no ICCID). No action needed.

missing_in_sf

Phone matches SF but SF has no ICCID on file. Add the bill's ICCID to the SF Asset.

mismatch_in_sf

SF has a different ICCID than the bill. Verify SIM swap with the carrier.

We never recommend ICCID actions when the phone isn't in Salesforce — first deal with the missing phone record; the SIM ID is downstream.

8. How files get into the engine

The upload screen has three explicit slots — the operator decides where each file goes. This eliminates the "wrong file in the wrong pile" failure mode that header-only classification can hit on edge cases.

1. Salesforce export *

Exactly one file. The source of truth — every billed line is matched against this.

2. InControl support files

Device list + SIM info. Used only to validate / enrich — never as a bill.

3. Carrier bills *

Verizon, T-Mobile, AT&T, Frontier, Data2Go, RH, etc. The data being audited.

Within each slot, the engine still inspects column headers to decide which extractor to run — the slot tells it what category of file this is, the headers tell it which carrier / sub-shape.

Salesforce export

Recognized by: Service Account · SIM ID · Supplier · MTN · Assets: Assets

InControl device list

Recognized by: S/N · Online Status · Online/Offline

InControl SIM info

Recognized by: SIM Slot · IMSI · ICCID

Verizon bill

Recognized by: Filename has 'Verizon' OR 'Quick Bill Summary' header

Frontier bill

Recognized by: Service_TN · Service_Circuit_Number · Service_Status

AT&T bill

Recognized by: BAN · FAN · Wireless Number columns OR filename

T-Mobile bill

Recognized by: MSISDN · Subscriber Number OR filename has 'T-Mobile' / 'T-Mo'

Data2Go bill

Recognized by: Filename has 'Data2Go' + 'Carrier Name' column (reseller — provider per row)

RH bill

Recognized by: Filename has 'RH' (AT&T reseller)

Generic carrier

Recognized by: Catch-all if a file has identifier + usage/amount columns

9. Confidence levels

Every row carries a confidence badge. Color matches what you see in the audit results.

high

Strong, unambiguous evidence. Multiple data sources agree. Act on it.

medium

Rule fired but evidence has known false-positive modes. Verify before acting.

low

Edge case the rules engine couldn't decide. Human review required.

10. 🤖 AI assist on edge cases

Everything above runs as deterministic JavaScript — no AI involved. AI only enters the picture for rows the rules engine couldn't confidently categorize (specifically: manualAuditNeeded rows or those flagged low confidence).

✅ Gemini sees the same evidence the rules engine saw and returns a refined bucket + plain-English narrative.
✅ Every AI verdict is cached in SQLite by a hash of the row's evidence — same input gives same output forever (deterministic).
✅ Off by default unless GEMINI_API_KEY is set in Settings AND the "Use AI on edge cases" toggle is on.
❌ AI is never used for the deterministic rules above. The bulk of the audit is byte-identical between runs even with AI off.

11. 💵 How leakage is calculated

The dollar number on the dashboard is computed from the categorized output — never AI-estimated.

Bucket	Counts toward leakage?	Fallback when amount missing
Ghost devices	Yes	$25 / mo
Add to Salesforce	Yes — we're paying for it	$25 / mo
Not on Salesforce	Yes	$25 / mo
Unbilled assets	No (not on a bill)	—
SIM swaps & failover	No (inventory issue)	—
No device attached	No (ambiguous)	—
Manual audit needed	No (no claim without justification)	—
Working SIMs	No (healthy)	—

12. ⚠ Spotted a row that's wrong?

Click ⚠ Report on any row in the audit results. A modal opens with the full row context. Describe what's wrong — we get an email with everything: the row's data, the engine's evidence, the recommendation it gave, and your description.

Then we tune the rule and add a regression test so the same false positive can never silently return. The engine learns from real use; this page updates to reflect the new behaviour. Every change to the rules updates this page in the same release.

13. 🚨 Salesforce-cancelled detection

When a row is matched to a Salesforce contract whose status reads Cancelled, Disconnected, Suspended, Terminated, Decommissioned, Inactive (or any close variant), it lands in Ghost devices with the highest possible confidence — no InControl heuristic needed. Salesforce is already telling us the line should be off; the only remaining action is the carrier-side cancellation.

The status vocabulary is intentionally broad to catch operator typos and carrier-specific words. "Cancelled", "cancelled ", "Disconnected", "Inactive" all map to the same canceled state.

Limitation worth knowing: this rule looks at the asset-level status. If your Salesforce export includes a separate parent Service Account status column (e.g. account-level cancellation), it isn't currently joined in. Add it as a column to the export and we'll wire it into the rule.

14. ☁︎ Direct Salesforce links

Every row whose ICCID/MTN matched a Salesforce record now shows an ☁︎ Open in SF chip under the SIM ICCID. Click it to open the live record in a new tab — no searching, no copy-paste.

The link is built from the Assets: ID column in your Salesforce export (values like a1ZHs00000OIVssMAH) and points at: metrowireless.lightning.force.com/lightning/r/Assets__c/<id>/view.

The asset ID is also included in CSV exports (column Salesforce Asset ID) and in the error-report email so anyone reviewing a flagged row can jump straight to the SF record.

Rows in the Not on Salesforce bucket don't have a SF link (by definition — they're missing from CRM). Rows in Unbilled assets (reverse audit) DO have links since they came from a SF record in the first place.

15. 🗂 Phone history & the Ignore mechanism

The system records every phone (or Frontier ICCID-only identity) it has ever seen across all audits. Each phone has a permanent record showing first-seen / last-seen, total audits it appeared in, the bucket it currently sits in, and a full timeline of every ignore, un-ignore, auto-clear, and per-audit categorization. Three SQLite tables back this: phone_records, phone_ignore_state, phone_history_log.

Each row has an Ignorecheckbox. Tick it to tell the engine "I've already reviewed this phone and approved it staying in this bucket." The phone instantly moves to the new Ignoredtab and won't surface in its current bucket on future audits.

Auto-clear safety net:the moment any of that phone's billed lines moves to a different bucket on a future audit (e.g. it shifts from Ghost devices to Unbilled assets), the ignore is automatically cleared and the row resurfaces. The system logs auto_cleared_bucket_changein the phone's history so the operator can see the transition.

Bulk operations. Select multiple rows with the row checkboxes, then use Ignore selected / Un-ignore selected in the bulk-action bar that appears at the top of the table. Every individual action is logged with timestamp + originating IP, even when triggered from a bulk request.

Per-phone history page. Click any phone number in the audit table to open /phones/<identity>. That page shows the phone's record summary (MTN, ICCID, first/last seen, last bucket, last carrier, total audits seen), its current ignore state if any, and a styled timeline of every event in chronological order with action labels (Ignored / Un-ignored / Auto-cleared / Categorized in audit), the bucket involved, the IP that triggered the action, and a clickable link back to the audit run that produced each timeline entry.

Identity scheme: phone-matched rows use the normalized 10-digit MTN as their identity. Frontier rows (which have only ICCID, no phone) use iccid:<digits> instead. Both can be ignored.

API surface: POST/DELETE /api/phones/[identity]/ignore, POST /api/phones/bulk-ignore, POST /api/phones/bulk-unignore, GET /api/phones/[identity]/history.

16. 📊 Per-audit dashboard

Every audit results page now opens with a strip of widgets above the bucket tabs:

Headline KPIs — total categorized lines, leakage USD, ignored phone count, distinct carriers, source-file count.
Bucket distribution — horizontal bar per bucket, sorted by count.
By carrier — counts grouped by detected carrier (Verizon, T-Mobile, AT&T, Frontier, Data2Go, RH, etc.).
By source file — counts grouped by the bill / file each row came from.
Carrier × bucket cross-tab — a small grid showing which carrier dominates which bucket, so e.g. you can see at a glance that all your ghost devices are Verizon.

Pure CSS/SVG bars — no chart library. The whole dashboard is server-rendered, so it appears instantly when the audit page loads.

17. 📄 PDF extraction (and why we don't cache)

Carrier PDFs (Verizon Bill Summary, RH, T-Mobile, etc.) go through a two-stage extractor:

pdf-parse (free, fast) reads the text layer of the PDF and runs regex over it for ICCIDs, phones, and prices. OCR character substitutions (O→0, |→1) and zero-width Unicode strip happen here only.
If pdf-parse returns < 1 row AND Document AI is configured (Settings → Document AI), the PDF is split into 15-page chunks and sent to DocAI for proper table extraction. This is paid per page; the first stage is what we lean on for most bills.
If both come up empty, the file is reported as ⚠ no rows extracted in the activity log AND in the file ingest summary panel, with the underlying reason ("Document AI not configured", "pdf-parse failed", etc.) shown alongside.

No caching. Earlier versions cached extractions by file-hash. Empty results got cached too, so a PDF that failed once silently returned 0 rows on every subsequent audit, dropping hundreds of billed lines into "Not on Salesforce". The cache was removed; every audit re-extracts every PDF from scratch. pdf-parse is fast on text PDFs (~50 pages/second), so the cost is mild and the trade-off is worth it.

Verizon Quick Bill Summary handling: when the keyword is detected, the carrier pack opts in to a $50 price fallback for any line that lacks a per-line amount. Prevents leakage from going to $0 just because the PDF rolled charges into a single total.

18. 🧾 File ingest summary panel

Every audit results page ends with a panel showing one row per uploaded file:

Filename · Kind (Salesforce export / Carrier billing / PDF invoice / InControl device / InControl SIM / Unclassified) · Rows contributed · Reason / details.
Files producing ⚠ 0 rows float to the top with an amber row tint — impossible to miss.
A header chip counts how many files came up empty.
A footnote points at the most common cause (PDFs needing Document AI configured under Settings).

Why this exists: when an audit returns suspicious totals (e.g. 813 lines after dropping 470-page Verizon PDFs that should produce thousands), the operator needs to know which file dropped data. The activity log surfaces this in the moment but disappears the second you navigate away. This panel is persisted on the audit run.

19. 🛡 Reliability & data integrity

A handful of safeguards built up over time. Knowing they exist helps when an audit looks "off":

ICCID precision preserved. SheetJS reads cells as JS numbers by default, which silently corrupts 19+ digit ICCIDs (precision loss past 2⁵³). The parser uses raw: false + cellText: true so cells come back as their formatted text, keeping every digit.
Salesforce HTML-as-XLS support. Salesforce's "Export Details" outputs HTML masquerading as .xls. The parser sniffs the first bytes; HTML-shaped input is routed through SheetJS's explicit HTML reader.
Pipeline fairness. An earlier bug had the rules engine compute rows for three buckets (Add to Salesforce, Unbilled assets, No device attached) and then drop them at dashboard rebuild. Fixed: every bucket is now exhaustively included.
Per-row uniqueness.Verizon Quick Bill PDFs emit the same phone on multiple lines (monthly + usage + overage). The pipeline now appends a per-row index to the join key so multiple rows for the same phone don't collapse into one.
Audit run stub. The audit_runs row is inserted up-front (with placeholder counts) so phone-history foreign keys resolve while the rules engine is still running. Final counts/leakage update the same row at the end.
OCR cleanup is PDF-only. Character substitutions like O→0 and |→1 live exclusively in the PDF extraction path. v1 ran them on clean CSV/XLSX values and corrupted status fields; v2 strictly does not.
Deterministic AI cache.AI resolver verdicts are cached by SHA-256 of the row's evidence payload, so byte-identical evidence always produces the same verdict. Helpful for cost control on re-runs.
Activity log diagnostics.Each audit's live log includes SF index · N contract(s) · sample MTNs: … and Billed lines sample MTNs: … so format mismatches are obvious within the first second of an audit.

20. 🔐 Sign-in & the action audit trail

Sign-in is restricted to @metrowireless.com Google Workspace accounts. Anyone outside the domain gets a denied_domain entry in the sign-in log and cannot enter the app. Authentication is handled by Auth.js v5 (NextAuth) using Google as the identity provider.

Setup (one-time)

In Google Cloud Console, create an OAuth 2.0 Client ID of type "Web application".
Authorized redirect URI: https://<your-host>/api/auth/callback/google (and the same for http://localhost:3001 in dev).
Copy the client id + secret into .env.local as AUTH_GOOGLE_ID and AUTH_GOOGLE_SECRET.
Set AUTH_SECRET to a random 32+ byte string (run npx auth secret).
Optional: ALLOWED_EMAIL_DOMAIN overrides the default metrowireless.com.
Restart the server. Until those env vars are set, the app boots in auth-disabled mode (every request is allowed; the header shows an "Auth disabled" badge).

What gets recorded

users — one row per @metrowireless.com email seen. Tracks display name, picture, first sign-in, last sign-in.
sign_in_log — every sign-in attempt: success, denied for wrong domain, denied for unverified Google email, or error. Captures email, IP, and user agent.
audit_actions_log — every mutating API request. Captures user id + email, IP, user agent, the action name (e.g. phone.ignore), and a JSON snapshot of the payload (truncated to 4 KB).
phone_history_log — extended with by_user_id and by_email. Per-phone history pages now show "by jane@metrowireless.com from 1.2.3.4 · audit 1234abcd…" for every event.

Tracked actions

audit.create — uploading files to start an audit (user, slot counts, label).
phone.ignore / phone.unignore — single-row toggles.
phone.bulk_ignore / phone.bulk_unignore — bulk operations from the action bar.
error_report.submit — the ⚠ Report dialog (user, audit, row, email status).

Where to view it

/admin/users — all users, latest 100 actions, latest 50 sign-in attempts (incl. denied).
/phones/<identity> — per-phone timeline with operator email + IP on every event.

How IP capture works

API actions — read x-forwarded-for first (proxy-friendly), then x-real-ip, then null. Captured on every mutating request via lib/actions/log.ts.
Sign-in events — NextAuth's signIn callback doesn't receive the Request, so we read headers via next/headers() inside the callback and stamp them onto the sign_in_log row.
Local dev — you'll see ::1 (IPv6 loopback) on every event because all requests come from localhost. In production behind nginx / Cloudflare, the real client IP shows up via x-forwarded-for.

Auth gate (proxy.ts)

The route gate lives in src/proxy.ts — Next.js 16 renamed the convention from middleware.ts in 16.x. Public allow-list: /sign-in, /api/auth/*, /how-it-works, and static assets. Everything else redirects unauthenticated requests to /sign-in?callbackUrl=… and returns the user to the original page after a successful sign-in.

History preservation policy

Operator-stated requirement: data is permanent unless explicitly deleted. To honor that:

Every schema change is additive — new columns added with ALTER TABLE … ADD COLUMN defaulting to NULL so existing rows survive.
All log / history tables are append-only — code never UPDATEs or DELETEs from phone_history_log, sign_in_log, audit_actions_log, error_reports, email_log.
audit_runs + audit_rows persist forever; only scripts/reset-db.ts destroys them, and that's a manual operator action.
phone_records is upsert-only — total_audits_seen accumulates, first_seen_at is preserved.
Deprecated tables stay — e.g. file_extractions (the removed PDF cache) is still in the schema with a deprecation comment; its data is preserved even though no code reads or writes it.

Authorization tier in v1: anyone signed in has full access. We can layer admin / auditor / viewer roles later without schema changes — the user record is already in place.

21. ✂ To-be-canceled tracker + the Audited rename

"Audited" replaces "Ignored"

The bucket and checkbox previously labeled Ignored are now labeled Audited. Same mechanic as before — operator approves a phone in its current bucket, it sticks until the categorization changes — just clearer language. The internal DB enum remains ignored so all existing history entries are preserved unchanged.

"To be canceled" — the new sticky tracker

When you find a line that needs to be canceled, click ✂ To cancel in the row's action cell. The phone immediately moves to the new To be canceled bucket and is flagged in phone_cancellation_flag.

The flag is sticky — every future audit places the phone in the To-be-canceled bucket regardless of what the rules engine would otherwise decide.
The flag auto-clears the moment two conditions are both met: the phone is no longer on any uploaded carrier bill AND its Salesforce status reads canceled (or there's no SF match at all).
On the audit where auto-clear fires, a single ✓ Cancellation confirmed row appears in Working SIMs with high confidence — your one-time acknowledgment that the cancellation cycle closed. After that the phone simply doesn't surface anywhere.
Operator can manually un-flag any time via ↩ Un-cancel; the row falls back to whatever bucket the rules engine chose.
Every transition (flag, un-flag, auto-confirm) is logged to phone_history_log with the operator's email + IP + audit run.

Recent fixes shipped with this release

Data2Go matching: Salesforce Asset rows for Data2Go carry the ICCID in the asset-name field (no phone). The SF extractor now falls through to ICCID parsing when the asset name doesn't contain a real phone, and normalizeMtn rejects 18+ digit strings starting with "89" (the ICCID industry prefix) so they don't show up as fake phones like 8901030330.
Report button hang: the ⚠ Report submit raced against an 8-second SMTP timeout. Slow or unreachable mail servers no longer leave the modal stuck on "Sending…" — the report is recorded with emailStatus = "failed" and the dialog returns immediately.
Bucket count: 9 → 10 with To be canceled. Tab grid widened to fit on a single line at xl screen sizes.

API: POST /api/phones/<id>/cancel with body { fromBucket, auditRunId, note? } · DELETE /api/phones/<id>/cancel to un-flag.

22. 📋 Reports tracker + global header nav

From email-only to a triageable list

The ⚠ Report button used to send an email and that was it. Now every submission becomes a row in the new /reports page — status-tracked, searchable, and resolvable in-app. Email is still sent (best-effort) but is no longer the primary record.

Status states: Open → In progress → Fixed or Won't fix. New submissions default to Open.
List view shows status, submitter email, phone/ICCID, original bucket, and the description preview.
Detail view (/reports/[id]) shows the full row snapshot at the time of report (so even if the underlying audit gets re-run, the original context is preserved), the description, suggested fix, and a status-update form with an optional resolution note.
Resolution audit trail: closing a report records who did it (user id + email), when, and the optional note. The same event is written to audit_actions_log as report.fixed / report.wont_fix, viewable on /admin/users.
Reporter identity is captured automatically from the signed-in session — no need to type your name.

Header nav now persists across every page

The header strip (added to the root layout) renders on every page — audit results, reports, settings, how-it-works, sign-in, everything. Links: Audits · Reports · Users · Settings · How it works. The active link is highlighted, so you can always see where you are. No more "Settings was only on the home page."

Schema changes (additive)

error_reports.status defaults to "open" — existing rows are treated as Open and surface in the list immediately.
Added reporter_email, reporter_user_id, resolution_note, resolved_by_user_id, resolved_by_email, resolved_at. All nullable.
Indexes on status and reported_at for the list view's sort and future filters.

MW Audit · automated reconciliation · v0.14.0 — May 8, 2026