
Convert PDF→Excel: extra spaces in words randomly


Nate  Schley


I had 3 PDF (v1.7) files with tables that I converted to Excel. I started by deleting extraneous pages, then deleted extraneous content on the remaining pages so that the only content left was the tables of interest. Generally, the conversion process worked like a charm.

The problem is that many of the words in the output have extraneous spaces in them, apparently at random positions, and I have to re-group the words manually.

Examples of words I need to manually correct:  "b rake", "br ake", "bra ke", "un it", "ou t", "ope n", "supp ress"
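
Until the conversion itself is sorted out, the manual re-grouping could probably be scripted. Below is a minimal sketch in Python, assuming the cell text has been pulled out of the workbook as plain strings and that I can supply a list of words I expect to see in the tables. The `rejoin_split_words` helper and the `EXPECTED_WORDS` list are hypothetical names of my own, not anything provided by Nitro Pro.

```python
def rejoin_split_words(text, vocabulary):
    """Merge adjacent fragments whose concatenation is a known word.

    Assumption: `vocabulary` holds the words expected in the tables.
    A pair is merged only when the joined form is in the vocabulary and
    the two fragments are not both valid words on their own.
    Note: runs of whitespace are collapsed to single spaces.
    """
    vocab = {word.lower() for word in vocabulary}
    tokens = text.split()
    result = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            left, right = tokens[i], tokens[i + 1]
            joined = left + right
            both_known = left.lower() in vocab and right.lower() in vocab
            if joined.lower() in vocab and not both_known:
                result.append(joined)
                i += 2  # skip the fragment that was just merged
                continue
        result.append(tokens[i])
        i += 1
    return " ".join(result)


# Hypothetical word list built from the examples above.
EXPECTED_WORDS = ["brake", "unit", "out", "open", "suppress"]

for sample in ["b rake", "br ake", "bra ke", "un it", "ou t", "ope n", "supp ress"]:
    print(repr(sample), "->", repr(rejoin_split_words(sample, EXPECTED_WORDS)))
```

This only looks at pairs of neighbouring fragments, so a word broken into three or more pieces, or one with punctuation attached, would still need manual attention.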

I saw one thread in the community citing this issue. The response referred to OCR settings in the user guide. My Nitro Pro settings were already set as advised. I noticed OCR→Recognition→Quality = medium, so I changed it to high and re-converted. There was no appreciable difference in the conversion output.

Here are the relevant app settings:

  • Conversion
    • Advanced recovery
      • All text
      • language = English
      • automatically OCR image-based docs
  • OCR
    • correct image skew - checked (I have no graphics beyond table lines)
    • Fixed threshold - unchecked
    • detect text orientation - unchecked (all text is horizontal, L-R)
    • Recognition
      • language = English
      • quality = high
    • Output
      • use system font based recognition - unchecked (all fonts seem to be embedded in the PDF)

I'm using Nitro Pro 13.16.2.300
