PDF inventory
Page last updated 30 April 2026
The PDF inventory collates all PDFs found on your website during an Insytful scan.
It provides a single view of every PDF, its metadata and its accessibility status, so you can quickly identify PDFs that need attention.
This is particularly useful if you're auditing a large site for accessibility compliance, cleaning up legacy documents, or preparing to meet standards and regulations such as WCAG 2.2 or the European Accessibility Act.
What information does Insytful scan for?
During a scan, Insytful collects the following for each PDF it finds:
Document URL and title
The web address where the PDF is hosted and the title set in the document's metadata. A missing or generic title ("Untitled" or a filename like report_final_v3.pdf) makes the document harder for users and search engines to identify.
Occurrences
The number of pages on your site that link to or embed this PDF. A high count means the document is widely referenced, so fixing any issues with it will have a bigger impact across your site.
Accessibility score
A rating that reflects how accessible the PDF is, based on factors like tagged structure, reading order, alternative text for images, and metadata completeness. A low score indicates barriers for users who rely on assistive technologies such as screen readers.
Author
The person or organisation recorded in the document's metadata. This helps with document ownership and accountability, especially when managing a large content library across multiple teams.
Machine readable
Whether the PDF contains actual text that software can read and interpret, or whether it's essentially a scanned image. If a PDF isn't machine readable, screen readers cannot parse its content, making it inaccessible to visually impaired users. These documents typically need OCR (Optical Character Recognition) processing to become accessible.
Created date and last modified date
When the document was first created and when it was last changed. Useful for spotting outdated content that may need reviewing, updating, or removing.
Language
The language set in the document's metadata. This tells screen readers which language to use when reading the document aloud. Without it, a screen reader may default to the wrong language, making the content unintelligible.
Size (KB and MB)
The file size of the document. Large PDFs can cause slow downloads and poor user experience, particularly on mobile devices or slower connections.
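To picture how these fields fit together, here is a minimal sketch of one inventory row as a Python record. The class name, field names and example values are hypothetical and are not taken from Insytful's export format. The size conversion assumes 1 KB = 1024 bytes, whereas a given tool may use 1000.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PdfRecord:
    """One row of a PDF inventory (hypothetical field names)."""
    url: str
    title: Optional[str]
    occurrences: int
    accessibility_score: Optional[float]
    author: Optional[str]
    machine_readable: bool
    created: Optional[date]
    last_modified: Optional[date]
    language: Optional[str]
    size_bytes: int

    @property
    def size_kb(self) -> float:
        # Assumption: 1 KB = 1024 bytes.
        return self.size_bytes / 1024

    @property
    def size_mb(self) -> float:
        # Assumption: 1 MB = 1024 * 1024 bytes.
        return self.size_bytes / (1024 * 1024)


# Example values are invented for illustration only.
record = PdfRecord(
    url="https://example.com/docs/annual-report.pdf",
    title="Annual report 2025",
    occurrences=12,
    accessibility_score=None,
    author=None,
    machine_readable=True,
    created=date(2025, 3, 1),
    last_modified=date(2025, 3, 14),
    language="en-GB",
    size_bytes=3_145_728,
)
print(f"{record.size_kb:.0f} KB, {record.size_mb:.1f} MB")  # prints "3072 KB, 3.0 MB"
```

Optional fields are modelled as `None` when the PDF does not supply them, which mirrors how a missing title, author or language would surface in the inventory.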
Missing metadata
If a document has missing metadata, Insytful highlights it on the PDF inventory page so you can address it. Missing metadata is one of the most common and most easily fixed PDF accessibility issues. Insytful flags the following:
No document title
The PDF has no title set in its properties. Users see this title in browser tabs, search results, and screen reader announcements. Without it, they may only see a filename, which is often meaningless.
No author data
No author is recorded. While this is less critical for end-user accessibility, it makes document governance more difficult and may be a compliance requirement in some sectors.
No language data
No language tag is set. As noted above, this directly affects how screen readers pronounce the content. For multilingual sites, this is especially important.
No keywords detected
The document contains no keyword metadata. Keywords help with search engine discoverability and internal content categorisation.
For a deeper look at why this matters, see our guide PDF metadata: what it is, why it's useful, and how it can help.
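If you want to reproduce the four missing-metadata checks in your own scripts, for example on metadata exported from another tool, they can be sketched roughly as follows. The dictionary keys and the function name are assumptions made for illustration. This is not Insytful's implementation, and it treats a blank value the same as an absent one.

```python
from typing import Mapping, Optional

# Hypothetical field names mapped to the flag labels listed above.
METADATA_FLAGS = {
    "title": "No document title",
    "author": "No author data",
    "language": "No language data",
    "keywords": "No keywords detected",
}


def missing_metadata_flags(metadata: Mapping[str, Optional[str]]) -> list[str]:
    """Return one flag for every metadata field that is absent or blank."""
    flags = []
    for field, label in METADATA_FLAGS.items():
        value = metadata.get(field)
        if value is None or not value.strip():
            flags.append(label)
    return flags


# The author is whitespace only and keywords are absent, so both are flagged.
flags = missing_metadata_flags(
    {"title": "Annual report 2025", "author": "  ", "language": "en-GB"}
)
print(flags)  # prints ['No author data', 'No keywords detected']
```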
What should I do about flagged PDFs?
Once you've identified PDFs with issues, here are the typical next steps:
For missing metadata
Open the PDF in a tool that supports metadata editing, such as Adobe Acrobat Pro or a free alternative like ExifTool. Then fill in the title, author, language, and keywords fields, and re-upload the corrected file to your site.
For PDFs that aren't machine readable
These are usually scanned documents saved as images. Run them through an OCR tool to convert the image content into selectable, searchable text. Commercial tools such as Adobe Acrobat and ABBYY FineReader, as well as open-source alternatives, can handle this.
For low accessibility scores
Review the PDF's tag structure, reading order, and image alt text. Depending on how the document was originally created, remediation can range from quick metadata fixes to more involved structural work, or to replacing the PDF with an HTML page.
For large file sizes
Consider compressing the PDF or, where appropriate, converting the content to an HTML page for a better web experience.
For outdated documents
If the created or modified dates suggest a document is stale, review whether it should be updated, replaced, or removed from the site entirely.
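The non-metadata steps above can be sketched as a simple triage function. Everything numeric here is an assumption chosen for illustration: the 0 to 100 score scale, the low-score cut-off, the size limit and the staleness window are not Insytful defaults, and the function and parameter names are hypothetical.

```python
from datetime import date
from typing import Optional

# Illustrative thresholds only; adjust to your own policy.
LOW_SCORE = 50          # assumes a 0-100 accessibility score
LARGE_FILE_MB = 10      # files above this are candidates for compression
STALE_AFTER_DAYS = 730  # roughly two years without modification


def next_steps(
    machine_readable: bool,
    accessibility_score: Optional[float],
    size_mb: float,
    last_modified: Optional[date],
    today: date,
) -> list[str]:
    """Map the inventory fields for one PDF to suggested follow-up actions."""
    steps = []
    if not machine_readable:
        steps.append("Run OCR")
    if accessibility_score is not None and accessibility_score < LOW_SCORE:
        steps.append("Review tags, reading order and alt text")
    if size_mb > LARGE_FILE_MB:
        steps.append("Compress or convert to HTML")
    if last_modified is not None and (today - last_modified).days > STALE_AFTER_DAYS:
        steps.append("Review whether to update, replace or remove")
    return steps


# A small scanned document with a low score that has not changed since 2019.
steps = next_steps(
    machine_readable=False,
    accessibility_score=35,
    size_mb=0.4,
    last_modified=date(2019, 5, 1),
    today=date(2026, 4, 30),
)
for step in steps:
    print(step)
```

Passing `today` in explicitly keeps the staleness check repeatable. When many PDFs are flagged, working through those with the highest occurrence count first tends to have the widest effect across the site.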
Still need help?
If you still need help, reach out to the Insytful community on Slack or raise a support ticket for assistance.