Convert PDF→Excel: extra spaces in words randomly

October 9, 2020

I had 3 PDF (v1.7) files with tables that I converted to Excel. I started by deleting extraneous pages, then deleting extraneous content on the remaining pages so that the only content left was the tables of interest. Generally, the conversion process worked like a charm.

The problem is that many of the words come out with extraneous spaces inside them, and I have to re-join the fragments manually.

Examples of words I need to manually correct: "b rake", "br ake", "bra ke", "un it", "ou t", "ope n", "supp ress"
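
Until the conversion itself is fixed, the manual re-grouping could probably be partly automated on the exported text. Here is a minimal sketch, assuming you can list the words you expect in the tables. The `VOCAB` set and the `rejoin_split_words` function are hypothetical names made up for illustration; nothing here is part of Nitro Pro.

```python
# Hypothetical vocabulary of words expected in the tables (an assumption,
# not something Nitro Pro provides). Extend it to match your documents.
VOCAB = {"brake", "unit", "out", "open", "suppress"}


def rejoin_split_words(text, vocab=VOCAB):
    """Merge two adjacent fragments when their concatenation is a known word.

    Only pairs are merged, which matches the examples above (one stray
    space per word). Runs of whitespace are collapsed to single spaces.
    """
    tokens = text.split()
    result = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            joined = tokens[i] + tokens[i + 1]
            if joined.lower() in vocab:
                # The two fragments form a known word, so keep them joined.
                result.append(joined)
                i += 2
                continue
        result.append(tokens[i])
        i += 1
    return " ".join(result)


if __name__ == "__main__":
    for cell in ["b rake", "br ake un it", "supp ress ou t", "the ope n valve"]:
        print(repr(cell), "->", repr(rejoin_split_words(cell)))
```

This can merge things it shouldn't if two real neighbouring words happen to concatenate into a vocabulary word, so the vocabulary should stay specific to the tables and the results should still be spot-checked.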

I found one thread in the community about this issue. The response referred to the OCR settings in the user guide, but my Nitro Pro settings were already set as advised. I noticed OCR→Recognition→Quality was set to Medium, so I changed it to High and re-converted. There was no appreciable difference in the conversion output.

Here are the relevant app settings:

Conversion
- Advanced recovery
  - All text
  - language = English
  - automatically OCR image-based docs
OCR
- correct image skew - checked (I have no graphics beyond table lines)
- Fixed threshold - unchecked
- detect text orientation - unchecked (all text is horizontal, L-R)
- Recognition
  - language = English
  - quality = high
- Output
  - use system font based recognition - unchecked (all fonts seem to be embedded in the PDF)

I'm using Nitro Pro 13.16.2.300


Nate Schley
