PDF Text Extractor - URL to clean text, per page

Give it public PDF URLs, get back clean text and document metadata. One block per page or per document, batch-capable, and callable as a synchronous API so AI agents and automations can extract PDFs on demand. No OCR needed for digital PDFs, no upload step, no key.

What it does

What you get per document

{
  "url": "https://arxiv.org/pdf/1706.03762",
  "pageCount": 15,
  "pagesExtracted": 15,
  "truncated": false,
  "metadata": { "producer": "pdfTeX", "creationDate": "..." },
  "pages": [
    { "page": 1, "text": "Attention Is All You Need\n..." }
  ]
}
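
As a minimal sketch of consuming one such record, the Python below parses the sample shape shown above and joins the per-page text into a single string. The helper names `full_text` and `is_complete` are illustrative and not part of the service. Reading `truncated` as "not every page was returned" is an assumption based on the field names alone.

```python
import json

# Sample record copied from the example above (page text abbreviated).
# A raw string keeps the JSON escape "\n" intact for json.loads.
record_json = r"""
{
  "url": "https://arxiv.org/pdf/1706.03762",
  "pageCount": 15,
  "pagesExtracted": 15,
  "truncated": false,
  "metadata": {"producer": "pdfTeX", "creationDate": "..."},
  "pages": [
    {"page": 1, "text": "Attention Is All You Need\n..."}
  ]
}
"""


def full_text(record, separator="\n\n"):
    """Join page texts in page order into one string."""
    pages = sorted(record.get("pages", []), key=lambda p: p["page"])
    return separator.join(p["text"] for p in pages)


def is_complete(record):
    """True if the record reports every page extracted and no truncation.

    Assumption: 'truncated' and 'pagesExtracted' < 'pageCount' both signal
    a partial extraction; the page itself does not define them.
    """
    return (
        not record.get("truncated", False)
        and record.get("pagesExtracted") == record.get("pageCount")
    )


record = json.loads(record_json)
text = full_text(record)
complete = is_complete(record)
```

Sorting by `page` before joining keeps the output stable even if a consumer receives the blocks out of order.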

Use cases

Pricing

Event                 Price (USD)
Run start             $0.0005
Per page extracted    $0.002
API call (standby)    $0.02

Comparable actors charge $0.022-$0.04 per page. A 100-page report here costs about $0.20 (100 pages x $0.002, plus the $0.0005 run-start fee). The live price on the Apify page is authoritative.
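
The arithmetic can be sketched as a small estimator using the event prices listed above. It assumes a batch run is billed one run-start event plus the per-page charge. How the standby API-call event combines with the other events is not stated here, so the sketch treats it as an optional, separately counted event. The function name `estimate_cost` is illustrative.

```python
from decimal import Decimal

# Event prices in USD, copied from the pricing list above.
# The live price on the Apify page may differ.
RUN_START = Decimal("0.0005")
PER_PAGE = Decimal("0.002")
API_CALL = Decimal("0.02")


def estimate_cost(pages, runs=1, api_calls=0):
    """Estimate cost in USD for a number of extracted pages.

    Assumption: events simply add up. 'runs' counts run-start events and
    'api_calls' counts standby API-call events.
    """
    return RUN_START * runs + PER_PAGE * pages + API_CALL * api_calls


report_cost = estimate_cost(100)  # one batch run over a 100-page report
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly.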

Runs on Apify - free account, pay per page, batch or synchronous standby endpoint for agents.


Related: Structured Data Extractor · SEC EDGAR Filing Monitor