How it works
The audit, in plain English
This page describes exactly what the engine does. It's kept in sync with the live rules — if you spot a row whose behaviour disagrees with what you read here, that's a bug: click ⚠ Report on the row and we'll fix it.
Engine: v0.14.0 — May 8, 2026
1. The one question
For every line on every carrier bill: "Is this phone number in our Salesforce, and what should we do about it?"
Then the reverse: "For every active Salesforce contract, did it appear on any uploaded bill?"That second pass is the "Unbilled assets" bucket.
2. The nine buckets
Every billed line — and every active Salesforce contract — ends up in exactly one bucket. Listed here in the priority order they appear in the audit results page (most actionable first; healthy & ignored last).
Ghost devices
Hardware appears dead. Cancel + retrieve.
All these must be true
- •Billed on the carrier bill
- •Phone matches Salesforce active
- •Bill shows zero or no usage
- •InControl shows device offline 30+ days (configurable in Settings)
- •Duplicate SIM ICCID on multiple devices: keep the online device, or the most recently online if all are offline
Action
Cancel the SIM at the carrier and retrieve the hardware.
Add to Salesforce
Billed and confirmed working — but missing from CRM.
All these must be true
- •Billed on the carrier bill
- •Phone is NOT in Salesforce
- •InControl confirms the device is ONLINE
Action
Add the Asset to Salesforce. We own it and we know it works.
Not on Salesforce
Billed line, missing from CRM, no InControl confirmation.
All these must be true
- •Billed on the carrier bill
- •Phone is NOT in Salesforce
- •InControl can't confirm (offline 60+ days OR no device record)
Action
Manual search (tickets / emails / order history) before canceling. If you find a record we own → add to SF; otherwise → cancel.
Unbilled assets (reverse audit)
Active in Salesforce but NOT on any carrier bill.
All these must be true
- •Salesforce status is Active or Deployed
- •The phone (or ICCID, for Frontier) doesn't appear on any uploaded carrier bill
Action
Verify line status with the carrier. May be canceled at the carrier without CRM update — risky for failover.
SIM swaps & failover
MTN/ICCID drift OR healthy failover SIMs.
All these must be true
- •(swap) Phone matches SF, but bill ICCID differs from SF ICCID
- •(missing-iccid) Phone matches SF, SF has NO ICCID on file → add the bill's ICCID to SF
- •(failover) Zero usage but InControl shows device online
Action
Verify, then either update Salesforce SIM ID, add the ICCID to SF, or leave alone (failover is doing its job).
No device attached
Billed line in CRM but no hardware in InControl.
All these must be true
- •Billed on the carrier bill
- •Phone matches Salesforce active
- •Bill shows zero or no usage
- •InControl has NO device record for this SIM
Action
Confirm: SIM-only sale, cold spare, or on-site backup SIM? If none of those, treat as ghost candidate.
Manual audit needed
True edge cases — kept as a safety net.
All these must be true
- •The rules engine couldn't fit this row into a clearer category
- •(in practice currently rare; any future rule branch that can't decide routes here)
Action
Human review of bill + Salesforce + InControl context.
Working SIMs
Healthy. No action needed.
All these must be true
- •Phone is in Salesforce active
- •Carrier bill reports usage > 0 MB, OR
- •Bill has no usage column AND InControl confirms the device is online
Action
Leave it alone.
Ignored
Operator-approved phones. Hidden from their natural bucket — until they move.
All these must be true
- •Operator clicked the Ignore checkbox on a row
- •All of the phone's billed lines are still in the bucket the operator approved
- •Auto-clears the second any of the phone's lines lands in a DIFFERENT bucket
Action
Nothing. The system has remembered your earlier review.
3. The decision tree
For every billed line the engine walks the forward audit; then for every Salesforce active contract it runs the reverse audit.
Forward audit (per billed line)
Reverse audit (per Salesforce active contract)
4. 📞 Phone number is the master key
The audit matches by phone for every carrier (one exception below — Frontier). Phones are normalized so all of these point to the same line:
(555) 123-4567+1 555-123-45671.555.123.45675551234567MTN: 555-123-4567sub: +1 (555) 123-45675551234567 ext 22555.123.4567 x100Column names in carrier files come in many flavors — all of these are recognized as the phone column:MTN · MDN · MSISDN · sub · sub: · subscriber · PTN · Service_TN · CURR_SRV_ACCS_NBR · Wireless Number · Mobile Number · Phone · Line Number · Contact
5. 🟡 Frontier — the one exception
Frontier doesn't put real phone numbers on their bills. Their Service_TN column actually holds the SIM ICCID (a 19-digit number starting with 891030…).
So for Frontier — and only Frontier — we match by ICCID instead of by phone. Salesforce records for Frontier supplier rows must have the SIM ID populated for matching to work.
6. ⛔ InControl is verification only
The audit list comes from carrier bills. InControl is never a source of audit candidates — only a witness for SIMs already on a bill.
- ✅ Bill says X is billed → look up X in InControl → use status as evidence
- ✅ Salesforce says contract Y is active → check whether Y appears on any bill
- ❌ NEVER: take an InControl SIM and ask "is this in Salesforce?" — that direction is forbidden
7. ICCID verification (only when phone matches)
Once a phone is matched to Salesforce, we check whether the bill's ICCID matches what SF has. There are three states:
match
Bill ICCID is in SF's contract ICCIDs (or bill has no ICCID). No action needed.
missing_in_sf
Phone matches SF but SF has no ICCID on file. Add the bill's ICCID to the SF Asset.
mismatch_in_sf
SF has a different ICCID than the bill. Verify SIM swap with the carrier.
We never recommend ICCID actions when the phone isn't in Salesforce — first deal with the missing phone record; the SIM ID is downstream.
8. How files get into the engine
The upload screen has three explicit slots — the operator decides where each file goes. This eliminates the "wrong file in the wrong pile" failure mode that header-only classification can hit on edge cases.
1. Salesforce export *
Exactly one file. The source of truth — every billed line is matched against this.
2. InControl support files
Device list + SIM info. Used only to validate / enrich — never as a bill.
3. Carrier bills *
Verizon, T-Mobile, AT&T, Frontier, Data2Go, RH, etc. The data being audited.
Within each slot, the engine still inspects column headers to decide which extractor to run — the slot tells it what category of file this is, the headers tell it which carrier / sub-shape.
Salesforce export
Recognized by: Service Account · SIM ID · Supplier · MTN · Assets: Assets
InControl device list
Recognized by: S/N · Online Status · Online/Offline
InControl SIM info
Recognized by: SIM Slot · IMSI · ICCID
Verizon bill
Recognized by: Filename has 'Verizon' OR 'Quick Bill Summary' header
Frontier bill
Recognized by: Service_TN · Service_Circuit_Number · Service_Status
AT&T bill
Recognized by: BAN · FAN · Wireless Number columns OR filename
T-Mobile bill
Recognized by: MSISDN · Subscriber Number OR filename has 'T-Mobile' / 'T-Mo'
Data2Go bill
Recognized by: Filename has 'Data2Go' + 'Carrier Name' column (reseller — provider per row)
RH bill
Recognized by: Filename has 'RH' (AT&T reseller)
Generic carrier
Recognized by: Catch-all if a file has identifier + usage/amount columns
9. Confidence levels
Every row carries a confidence badge. Color matches what you see in the audit results.
high
Strong, unambiguous evidence. Multiple data sources agree. Act on it.
medium
Rule fired but evidence has known false-positive modes. Verify before acting.
low
Edge case the rules engine couldn't decide. Human review required.
10. 🤖 AI assist on edge cases
Everything above runs as deterministic JavaScript — no AI involved. AI only enters the picture for rows the rules engine couldn't confidently categorize (specifically: manualAuditNeeded rows or those flagged low confidence).
- ✅ Gemini sees the same evidence the rules engine saw and returns a refined bucket + plain-English narrative.
- ✅ Every AI verdict is cached in SQLite by a hash of the row's evidence — same input gives same output forever (deterministic).
- ✅ Off by default unless
GEMINI_API_KEYis set in Settings AND the "Use AI on edge cases" toggle is on. - ❌ AI is never used for the deterministic rules above. The bulk of the audit is byte-identical between runs even with AI off.
11. 💵 How leakage is calculated
The dollar number on the dashboard is computed from the categorized output — never AI-estimated.
| Bucket | Counts toward leakage? | Fallback when amount missing |
|---|---|---|
| Ghost devices | Yes | $25 / mo |
| Add to Salesforce | Yes — we're paying for it | $25 / mo |
| Not on Salesforce | Yes | $25 / mo |
| Unbilled assets | No (not on a bill) | — |
| SIM swaps & failover | No (inventory issue) | — |
| No device attached | No (ambiguous) | — |
| Manual audit needed | No (no claim without justification) | — |
| Working SIMs | No (healthy) | — |
12. ⚠ Spotted a row that's wrong?
Click ⚠ Report on any row in the audit results. A modal opens with the full row context. Describe what's wrong — we get an email with everything: the row's data, the engine's evidence, the recommendation it gave, and your description.
Then we tune the rule and add a regression test so the same false positive can never silently return. The engine learns from real use; this page updates to reflect the new behaviour. Every change to the rules updates this page in the same release.
13. 🚨 Salesforce-cancelled detection
When a row is matched to a Salesforce contract whose status reads Cancelled, Disconnected, Suspended, Terminated, Decommissioned, Inactive (or any close variant), it lands in Ghost devices with the highest possible confidence — no InControl heuristic needed. Salesforce is already telling us the line should be off; the only remaining action is the carrier-side cancellation.
The status vocabulary is intentionally broad to catch operator typos and carrier-specific words. "Cancelled", "cancelled ", "Disconnected", "Inactive" all map to the same canceled state.
Limitation worth knowing: this rule looks at the asset-level status. If your Salesforce export includes a separate parent Service Account status column (e.g. account-level cancellation), it isn't currently joined in. Add it as a column to the export and we'll wire it into the rule.
14. ☁︎ Direct Salesforce links
Every row whose ICCID/MTN matched a Salesforce record now shows an ☁︎ Open in SF chip under the SIM ICCID. Click it to open the live record in a new tab — no searching, no copy-paste.
The link is built from the Assets: ID column in your Salesforce export (values like a1ZHs00000OIVssMAH) and points at: metrowireless.lightning.force.com/lightning/r/Assets__c/<id>/view.
The asset ID is also included in CSV exports (column Salesforce Asset ID) and in the error-report email so anyone reviewing a flagged row can jump straight to the SF record.
Rows in the Not on Salesforce bucket don't have a SF link (by definition — they're missing from CRM). Rows in Unbilled assets (reverse audit) DO have links since they came from a SF record in the first place.
15. 🗂 Phone history & the Ignore mechanism
The system records every phone (or Frontier ICCID-only identity) it has ever seen across all audits. Each phone has a permanent record showing first-seen / last-seen, total audits it appeared in, the bucket it currently sits in, and a full timeline of every ignore, un-ignore, auto-clear, and per-audit categorization. Three SQLite tables back this: phone_records, phone_ignore_state, phone_history_log.
Each row has an Ignorecheckbox. Tick it to tell the engine "I've already reviewed this phone and approved it staying in this bucket." The phone instantly moves to the new Ignoredtab and won't surface in its current bucket on future audits.
Auto-clear safety net:the moment any of that phone's billed lines moves to a different bucket on a future audit (e.g. it shifts from Ghost devices to Unbilled assets), the ignore is automatically cleared and the row resurfaces. The system logs auto_cleared_bucket_changein the phone's history so the operator can see the transition.
Bulk operations. Select multiple rows with the row checkboxes, then use Ignore selected / Un-ignore selected in the bulk-action bar that appears at the top of the table. Every individual action is logged with timestamp + originating IP, even when triggered from a bulk request.
Per-phone history page. Click any phone number in the audit table to open /phones/<identity>. That page shows the phone's record summary (MTN, ICCID, first/last seen, last bucket, last carrier, total audits seen), its current ignore state if any, and a styled timeline of every event in chronological order with action labels (Ignored / Un-ignored / Auto-cleared / Categorized in audit), the bucket involved, the IP that triggered the action, and a clickable link back to the audit run that produced each timeline entry.
Identity scheme: phone-matched rows use the normalized 10-digit MTN as their identity. Frontier rows (which have only ICCID, no phone) use iccid:<digits> instead. Both can be ignored.
API surface: POST/DELETE /api/phones/[identity]/ignore, POST /api/phones/bulk-ignore, POST /api/phones/bulk-unignore, GET /api/phones/[identity]/history.
16. 📊 Per-audit dashboard
Every audit results page now opens with a strip of widgets above the bucket tabs:
- Headline KPIs — total categorized lines, leakage USD, ignored phone count, distinct carriers, source-file count.
- Bucket distribution — horizontal bar per bucket, sorted by count.
- By carrier — counts grouped by detected carrier (Verizon, T-Mobile, AT&T, Frontier, Data2Go, RH, etc.).
- By source file — counts grouped by the bill / file each row came from.
- Carrier × bucket cross-tab — a small grid showing which carrier dominates which bucket, so e.g. you can see at a glance that all your ghost devices are Verizon.
Pure CSS/SVG bars — no chart library. The whole dashboard is server-rendered, so it appears instantly when the audit page loads.
17. 📄 PDF extraction (and why we don't cache)
Carrier PDFs (Verizon Bill Summary, RH, T-Mobile, etc.) go through a two-stage extractor:
- pdf-parse (free, fast) reads the text layer of the PDF and runs regex over it for ICCIDs, phones, and prices. OCR character substitutions (
O→0,|→1) and zero-width Unicode strip happen here only. - If pdf-parse returns < 1 row AND Document AI is configured (Settings → Document AI), the PDF is split into 15-page chunks and sent to DocAI for proper table extraction. This is paid per page; the first stage is what we lean on for most bills.
- If both come up empty, the file is reported as ⚠ no rows extracted in the activity log AND in the file ingest summary panel, with the underlying reason ("Document AI not configured", "pdf-parse failed", etc.) shown alongside.
No caching. Earlier versions cached extractions by file-hash. Empty results got cached too, so a PDF that failed once silently returned 0 rows on every subsequent audit, dropping hundreds of billed lines into "Not on Salesforce". The cache was removed; every audit re-extracts every PDF from scratch. pdf-parse is fast on text PDFs (~50 pages/second), so the cost is mild and the trade-off is worth it.
Verizon Quick Bill Summary handling: when the keyword is detected, the carrier pack opts in to a $50 price fallback for any line that lacks a per-line amount. Prevents leakage from going to $0 just because the PDF rolled charges into a single total.
18. 🧾 File ingest summary panel
Every audit results page ends with a panel showing one row per uploaded file:
- Filename · Kind (Salesforce export / Carrier billing / PDF invoice / InControl device / InControl SIM / Unclassified) · Rows contributed · Reason / details.
- Files producing ⚠ 0 rows float to the top with an amber row tint — impossible to miss.
- A header chip counts how many files came up empty.
- A footnote points at the most common cause (PDFs needing Document AI configured under Settings).
Why this exists: when an audit returns suspicious totals (e.g. 813 lines after dropping 470-page Verizon PDFs that should produce thousands), the operator needs to know which file dropped data. The activity log surfaces this in the moment but disappears the second you navigate away. This panel is persisted on the audit run.
19. 🛡 Reliability & data integrity
A handful of safeguards built up over time. Knowing they exist helps when an audit looks "off":
- ICCID precision preserved. SheetJS reads cells as JS numbers by default, which silently corrupts 19+ digit ICCIDs (precision loss past 253). The parser uses
raw: false+cellText: trueso cells come back as their formatted text, keeping every digit. - Salesforce HTML-as-XLS support. Salesforce's "Export Details" outputs HTML masquerading as
.xls. The parser sniffs the first bytes; HTML-shaped input is routed through SheetJS's explicit HTML reader. - Pipeline fairness. An earlier bug had the rules engine compute rows for three buckets (Add to Salesforce, Unbilled assets, No device attached) and then drop them at dashboard rebuild. Fixed: every bucket is now exhaustively included.
- Per-row uniqueness.Verizon Quick Bill PDFs emit the same phone on multiple lines (monthly + usage + overage). The pipeline now appends a per-row index to the join key so multiple rows for the same phone don't collapse into one.
- Audit run stub. The
audit_runsrow is inserted up-front (with placeholder counts) so phone-history foreign keys resolve while the rules engine is still running. Final counts/leakage update the same row at the end. - OCR cleanup is PDF-only. Character substitutions like
O→0and|→1live exclusively in the PDF extraction path. v1 ran them on clean CSV/XLSX values and corrupted status fields; v2 strictly does not. - Deterministic AI cache.AI resolver verdicts are cached by SHA-256 of the row's evidence payload, so byte-identical evidence always produces the same verdict. Helpful for cost control on re-runs.
- Activity log diagnostics.Each audit's live log includes
SF index · N contract(s) · sample MTNs: …andBilled lines sample MTNs: …so format mismatches are obvious within the first second of an audit.
20. 🔐 Sign-in & the action audit trail
Sign-in is restricted to @metrowireless.com Google Workspace accounts. Anyone outside the domain gets a denied_domain entry in the sign-in log and cannot enter the app. Authentication is handled by Auth.js v5 (NextAuth) using Google as the identity provider.
Setup (one-time)
- In Google Cloud Console, create an OAuth 2.0 Client ID of type "Web application".
- Authorized redirect URI:
https://<your-host>/api/auth/callback/google(and the same forhttp://localhost:3001in dev). - Copy the client id + secret into
.env.localasAUTH_GOOGLE_IDandAUTH_GOOGLE_SECRET. - Set
AUTH_SECRETto a random 32+ byte string (runnpx auth secret). - Optional:
ALLOWED_EMAIL_DOMAINoverrides the defaultmetrowireless.com. - Restart the server. Until those env vars are set, the app boots in auth-disabled mode (every request is allowed; the header shows an "Auth disabled" badge).
What gets recorded
- users — one row per @metrowireless.com email seen. Tracks display name, picture, first sign-in, last sign-in.
- sign_in_log — every sign-in attempt: success, denied for wrong domain, denied for unverified Google email, or error. Captures email, IP, and user agent.
- audit_actions_log — every mutating API request. Captures user id + email, IP, user agent, the action name (e.g.
phone.ignore), and a JSON snapshot of the payload (truncated to 4 KB). - phone_history_log — extended with
by_user_idandby_email. Per-phone history pages now show "by jane@metrowireless.com from 1.2.3.4 · audit 1234abcd…" for every event.
Tracked actions
audit.create— uploading files to start an audit (user, slot counts, label).phone.ignore/phone.unignore— single-row toggles.phone.bulk_ignore/phone.bulk_unignore— bulk operations from the action bar.error_report.submit— the ⚠ Report dialog (user, audit, row, email status).
Where to view it
- /admin/users — all users, latest 100 actions, latest 50 sign-in attempts (incl. denied).
/phones/<identity>— per-phone timeline with operator email + IP on every event.
How IP capture works
- API actions — read
x-forwarded-forfirst (proxy-friendly), thenx-real-ip, then null. Captured on every mutating request vialib/actions/log.ts. - Sign-in events — NextAuth's
signIncallback doesn't receive the Request, so we read headers vianext/headers()inside the callback and stamp them onto thesign_in_logrow. - Local dev — you'll see
::1(IPv6 loopback) on every event because all requests come from localhost. In production behind nginx / Cloudflare, the real client IP shows up viax-forwarded-for.
Auth gate (proxy.ts)
The route gate lives in src/proxy.ts — Next.js 16 renamed the convention from middleware.ts in 16.x. Public allow-list: /sign-in, /api/auth/*, /how-it-works, and static assets. Everything else redirects unauthenticated requests to /sign-in?callbackUrl=… and returns the user to the original page after a successful sign-in.
History preservation policy
Operator-stated requirement: data is permanent unless explicitly deleted. To honor that:
- Every schema change is additive — new columns added with
ALTER TABLE … ADD COLUMNdefaulting to NULL so existing rows survive. - All log / history tables are append-only — code never UPDATEs or DELETEs from
phone_history_log,sign_in_log,audit_actions_log,error_reports,email_log. - audit_runs + audit_rows persist forever; only
scripts/reset-db.tsdestroys them, and that's a manual operator action. - phone_records is upsert-only —
total_audits_seenaccumulates,first_seen_atis preserved. - Deprecated tables stay — e.g.
file_extractions(the removed PDF cache) is still in the schema with a deprecation comment; its data is preserved even though no code reads or writes it.
Authorization tier in v1: anyone signed in has full access. We can layer admin / auditor / viewer roles later without schema changes — the user record is already in place.
21. ✂ To-be-canceled tracker + the Audited rename
"Audited" replaces "Ignored"
The bucket and checkbox previously labeled Ignored are now labeled Audited. Same mechanic as before — operator approves a phone in its current bucket, it sticks until the categorization changes — just clearer language. The internal DB enum remains ignored so all existing history entries are preserved unchanged.
"To be canceled" — the new sticky tracker
When you find a line that needs to be canceled, click ✂ To cancel in the row's action cell. The phone immediately moves to the new To be canceled bucket and is flagged in phone_cancellation_flag.
- The flag is sticky — every future audit places the phone in the To-be-canceled bucket regardless of what the rules engine would otherwise decide.
- The flag auto-clears the moment two conditions are both met: the phone is no longer on any uploaded carrier bill AND its Salesforce status reads canceled (or there's no SF match at all).
- On the audit where auto-clear fires, a single ✓ Cancellation confirmed row appears in Working SIMs with high confidence — your one-time acknowledgment that the cancellation cycle closed. After that the phone simply doesn't surface anywhere.
- Operator can manually un-flag any time via ↩ Un-cancel; the row falls back to whatever bucket the rules engine chose.
- Every transition (flag, un-flag, auto-confirm) is logged to
phone_history_logwith the operator's email + IP + audit run.
Recent fixes shipped with this release
- Data2Go matching: Salesforce Asset rows for Data2Go carry the ICCID in the asset-name field (no phone). The SF extractor now falls through to ICCID parsing when the asset name doesn't contain a real phone, and
normalizeMtnrejects 18+ digit strings starting with "89" (the ICCID industry prefix) so they don't show up as fake phones like 8901030330. - Report button hang: the ⚠ Report submit raced against an 8-second SMTP timeout. Slow or unreachable mail servers no longer leave the modal stuck on "Sending…" — the report is recorded with
emailStatus = "failed"and the dialog returns immediately. - Bucket count: 9 → 10 with To be canceled. Tab grid widened to fit on a single line at xl screen sizes.
API: POST /api/phones/<id>/cancel with body { fromBucket, auditRunId, note? } · DELETE /api/phones/<id>/cancel to un-flag.
22. 📋 Reports tracker + global header nav
From email-only to a triageable list
The ⚠ Report button used to send an email and that was it. Now every submission becomes a row in the new /reports page — status-tracked, searchable, and resolvable in-app. Email is still sent (best-effort) but is no longer the primary record.
- Status states: Open → In progress → Fixed or Won't fix. New submissions default to Open.
- List view shows status, submitter email, phone/ICCID, original bucket, and the description preview.
- Detail view (
/reports/[id]) shows the full row snapshot at the time of report (so even if the underlying audit gets re-run, the original context is preserved), the description, suggested fix, and a status-update form with an optional resolution note. - Resolution audit trail: closing a report records who did it (user id + email), when, and the optional note. The same event is written to
audit_actions_logasreport.fixed/report.wont_fix, viewable on /admin/users. - Reporter identity is captured automatically from the signed-in session — no need to type your name.
Header nav now persists across every page
The header strip (added to the root layout) renders on every page — audit results, reports, settings, how-it-works, sign-in, everything. Links: Audits · Reports · Users · Settings · How it works. The active link is highlighted, so you can always see where you are. No more "Settings was only on the home page."
Schema changes (additive)
error_reports.statusdefaults to"open"— existing rows are treated as Open and surface in the list immediately.- Added
reporter_email,reporter_user_id,resolution_note,resolved_by_user_id,resolved_by_email,resolved_at. All nullable. - Indexes on
statusandreported_atfor the list view's sort and future filters.
MW Audit · automated reconciliation · v0.14.0 — May 8, 2026