Ninjadoc AI – Precise Document Q&A for Developers
Ninjadoc AI is the first document‑question‑answer platform that returns not only the extracted value but also the exact bounding‑box coordinates where the data lives in the source file. Built for developers, it offers a visual Q&A Schema Builder, a clean REST Extraction API, and a TypeScript SDK with a React HighlightOverlay component for interactive visual verification.
Key Features
- Visual Q&A Schema Builder – Define extraction fields by asking natural‑language questions (e.g., "What is the total?") or by specifying technical field names. The builder creates reusable processors for any document type.
- Coordinate Proof – Every field is returned with a geometry array (four corner points) and a confidence score, enabling verifiable extraction and on‑screen highlighting.
- React HighlightOverlay SDK – Drop‑in component that renders the coordinates as interactive overlays, perfect for building UI that lets users inspect the source of each answer.
- Developer‑First REST API – Simple `POST /api/extract` endpoint; send a PDF and a processor ID, receive structured JSON with values, coordinates, and metadata.
- Pay‑Per‑Use Pricing – 1,250 free credits to start, then transparent credit‑based pricing (25 credits/page + 5 credits/field). No hidden fees, no long‑term contracts.
- Multi‑Domain Support – Ready‑made schemas for invoices, contracts, IDs, medical forms, bills of lading, and a generic “any document” mode.
- Self‑Learning AI Core – Continuous model improvement from anonymized data, with reported accuracy above 95% across varied layouts.
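
To make the credit pricing above concrete (25 credits/page + 5 credits/field, 1,250 free credits), here is a minimal TypeScript sketch for estimating usage. The function name `estimateCredits` is hypothetical and not part of the SDK, and the sketch assumes the per‑field charge applies once per document rather than once per page.

```typescript
// Credit rates quoted in the pricing bullet above.
const CREDITS_PER_PAGE = 25;
const CREDITS_PER_FIELD = 5;
const FREE_CREDITS = 1250;

// Hypothetical helper (not part of @ninjadoc/sdk).
// Assumption: fields are billed once per document, not once per page.
function estimateCredits(pages: number, fields: number): number {
  const valid =
    Number.isInteger(pages) && pages >= 0 &&
    Number.isInteger(fields) && fields >= 0;
  if (!valid) {
    throw new RangeError("pages and fields must be non-negative integers");
  }
  return pages * CREDITS_PER_PAGE + fields * CREDITS_PER_FIELD;
}

// A 2-page invoice with 6 extracted fields: 2 * 25 + 6 * 5 = 80 credits.
const perInvoice = estimateCredits(2, 6);

// Whole invoices of that shape covered by the free credits: floor(1250 / 80) = 15.
const freeInvoices = Math.floor(FREE_CREDITS / perInvoice);
```

Under these assumptions, the free tier covers roughly fifteen such invoices; adjust the page and field counts to match your own documents.
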
Typical Use Cases
Domain | Example Question | Outcome |
---|---|---|
Finance | "What is the total amount?" | Extracts total, subtotal, tax with coordinates for invoices and receipts. |
Legal | "When does the contract end?" | Returns contract_end_date and highlights the clause location. |
Identity Verification | "Is the holder over 21?" | Provides boolean answer plus coordinates on the ID document. |
Healthcare | "List patient allergies." | Pulls allergy data from intake forms with exact location for EMR integration. |
Logistics | "Who is the consignee?" | Retrieves consignee name from bills of lading and shows its position. |
Batch Processing | "Extract invoice numbers from 500 PDFs." | Scales via API, returns a JSON array of results with coordinates for each file. |
Quick Integration Steps
- Create a Schema – Use the visual builder or API to ask questions and generate a processor ID.
- Install the SDK – Run `npm i @ninjadoc/sdk` and import `HighlightOverlay` into your React app.
- Call the Extraction API:

      curl -X POST "https://ninjadoc.ai/api/extract" \
        -H "X-API-Key: YOUR_API_KEY" \
        -F "document=@/path/to/file.pdf" \
        -F "processor_id=invoice_qa_processor"

- Render Results – Pass the returned geometry to `HighlightOverlay` to display interactive boxes over the original PDF.
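
If you want to draw the boxes yourself, or check what an overlay should look like, the four corner points can be reduced to a pixel rectangle. The sketch below is illustrative only: it assumes each point is an `{ x, y }` pair normalized to the 0..1 range of the page, which this page does not specify, and the names `Point`, `PixelRect`, and `geometryToRect` are hypothetical rather than SDK exports.

```typescript
// Assumed shape of one corner point; the real response format may differ.
interface Point {
  x: number; // assumed: fraction of page width, 0..1
  y: number; // assumed: fraction of page height, 0..1
}

interface PixelRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical helper: axis-aligned bounding rectangle of the corner points,
// scaled to the rendered page size in pixels.
function geometryToRect(
  geometry: Point[],
  pageWidth: number,
  pageHeight: number
): PixelRect {
  if (geometry.length === 0) {
    throw new Error("geometry must contain at least one point");
  }
  const xs = geometry.map((p) => p.x);
  const ys = geometry.map((p) => p.y);
  const minX = Math.min(...xs);
  const maxX = Math.max(...xs);
  const minY = Math.min(...ys);
  const maxY = Math.max(...ys);
  return {
    left: minX * pageWidth,
    top: minY * pageHeight,
    width: (maxX - minX) * pageWidth,
    height: (maxY - minY) * pageHeight,
  };
}

// Example: a field in the middle of a page rendered at 800 x 1000 pixels.
const sampleGeometry: Point[] = [
  { x: 0.25, y: 0.5 },
  { x: 0.75, y: 0.5 },
  { x: 0.75, y: 0.625 },
  { x: 0.25, y: 0.625 },
];
const rect = geometryToRect(sampleGeometry, 800, 1000);
```

The resulting `rect` can be applied as absolute positioning over the rendered page. Taking the min and max of the corners keeps the box axis‑aligned, so text on a skewed scan gets a slightly larger rectangle.
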
FAQ
- Do I need to train a model? No. The Q&A schema is built by asking questions; the underlying AI handles layout variations automatically.
- What formats are supported? PDFs, images (PNG, JPG), and multi‑page documents.
- How accurate is the extraction? Reported accuracy is above 95%, and every field comes with its own confidence score.
- Can I run it on‑prem? Currently offered as a hosted SaaS with a public API.
- What happens after free credits? Credits are purchased in bundles (4,500 credits for $10, 10,000 credits for $25, etc.).
- Is my data private? Documents are processed in memory and deleted after extraction; no persistent storage unless you enable logging.
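
Because each field carries a confidence score (see the accuracy answer above), one common pattern is to auto‑accept high‑confidence fields and route the rest to manual review. This is a rough sketch, not the SDK's API: the `ExtractedField` shape, the 0..1 confidence scale, and the 0.9 default threshold are all assumptions chosen for illustration.

```typescript
// Assumed field shape; the actual response JSON may use different names.
interface ExtractedField {
  name: string;
  value: string;
  confidence: number; // assumed to be in the 0..1 range
}

interface TriageResult {
  accepted: ExtractedField[];
  needsReview: ExtractedField[];
}

// Hypothetical helper: split fields by a confidence threshold.
// The 0.9 default is an arbitrary illustration, not a recommended value.
function triageByConfidence(
  fields: ExtractedField[],
  threshold: number = 0.9
): TriageResult {
  const accepted: ExtractedField[] = [];
  const needsReview: ExtractedField[] = [];
  for (const field of fields) {
    if (field.confidence >= threshold) {
      accepted.push(field);
    } else {
      needsReview.push(field);
    }
  }
  return { accepted, needsReview };
}

// Example with made-up values.
const triaged = triageByConfidence([
  { name: "total", value: "118.40", confidence: 0.98 },
  { name: "invoice_number", value: "INV-0042", confidence: 0.72 },
  { name: "tax", value: "18.40", confidence: 0.9 },
]);
```

Fields in `needsReview` are natural candidates for the visual verification that the coordinate overlays make possible.
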
Ready to replace brittle OCR pipelines and generic LLM calls? Start building your first Q&A schema today and get 1,250 free credits.