
Annotation Management Issue


Hi everyone. I'm not sure this is the right place to post this text; if it is not, please let me know.

I posted this message on many forums because I’m desperate.

I write from a researcher's point of view but I think the problem of annotations affects anyone who works extensively with annotations, even in the business community.

The problem I am raising is one of the biggest obstacles to a paperless world.

Because of this problem, it is almost impossible to work only on screen, without printing PDF files and annotation summaries. We are forced to use a mixed system, working on screen and on paper at the same time, which becomes a nightmare for medium and large research projects. It is almost worse than the old index card system.

Most professional PDF software mixes at least two functions: 1) a PDF file editing function (rotating pages, adding or deleting pages, cropping, etc.) and 2) an annotation function. The second function is the less developed of the two, and because it is underdeveloped it cannot fulfill its role properly.

In fact, the two functions can be separated into two independent programs. Professionals who use the editing features of PDF files are not usually interested in annotation features, and those who use annotations extensively have little interest in PDF editing. We don’t need all the PDF editing tools, which we almost never use anyway. We need powerful tools to navigate easily across all annotations in all PDF files on a device, no matter where the files are located. So, a PDF Editor is not a PDF Annotator.

Now, for annotation, most professional software offers features for Sorting, Summarizing and Searching annotations. These features are not well implemented. They are not powerful enough, not well organized, not well integrated. In short, they are not the central features of any PDF software known to me, because annotations are not the focus of these programs.

Usually, Sorting, Summarizing and Searching only apply to one document at a time. It is not possible to sort and summarize the annotations of a bunch of PDF files, for example all PDF files in a given directory. Many programs offer advanced search features, such as searching across multiple files, but the results are awkwardly displayed and they can’t be sorted or summarized. This is an example of features that are not well integrated. You can sort and summarize within a single file, you can search across multiple files, but you can’t do both.

So, we badly need software to manage PDF annotations: software that focuses on annotations, NOT on PDF files as a whole. The main object of a PDF Annotator is not a PDF file but a PDF annotation. The main window of a PDF Annotator has to display PDF annotations, not PDF files as is the case with every PDF Viewer or Editor.

From now on:

- I will use the index card metaphor, but do not worry, I will not push the metaphor too far.

- I will focus on one annotation type: Highlight.

- I will give examples from PDF-XChange Editor since this is the professional PDF software I’m using.

- I will discuss Qiqqa since it is the software that comes closest to a real PDF Annotator. But there are major flaws in the way Qiqqa has implemented the annotation feature.

- I will use the vocabulary from “JavaScript for Acrobat API Reference”, April 2007 (http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/js_api_reference.pdf).

- In what follows, we must keep in mind that many people are struggling with hundreds of PDF files that contain dozens of annotations.

There are many annotation types: Circle, FreeText, Line, etc. The most popular type is probably Highlight, at least in the scientific community. Each annotation type has a set of properties, many of which are common to several types and some of which are common to all types. For example, “author”, “content”, “subject” and “creationDate” are common to all annotation types. There are also properties for an entire document, such as “authors”, “creationDate”, “keywords”, “subject” and “title”, which can be very useful even if an entire document is not the main object of interest. The “content” of an annotation (usually called a comment) is the text you put in the pop-up window associated with the annotation. When you highlight a portion of text you can add a comment in the pop-up window. Notice that most PDF software talks about Comments instead of Annotations.

The most important property for handling annotations (and files) is the "subject" property. As with the old index cards, we need to create topics for our annotations so we can easily classify and find them. It is therefore natural to use the "subject" property to do this. Most PDF Editors allow you to write topics in the "subject" property. But, as implemented in most software, if I write several topics in the “subject” property (let's say separated by semicolons), the software is usually unable to Sort, Summarize or Search by a particular topic. For example, I enter the following topics in the “subject” property of a given annotation, a Highlight annotation in my case:

     scientific method; explanation; scientific truth; testability; Karl Popper

With XChange Editor, if I sort by subject, the whole string is treated as a single word, so the sort is effectively driven by the first word in the “subject” property, in this case ‘scientific’. If I summarize the annotations by subject, the same problem occurs: the whole string is treated as a single word. On top of that, both features can only be applied to one file at a time.

Searching is a bit more powerful, at least in XChange Editor, since the Search Pane has many options for advanced searches. Contrary to the Sort and Summarize features, the whole string is NOT considered a single word. So, in the example above, I can search for “testability” only.
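To make the expected behavior concrete, here is a minimal Python sketch of treating the “subject” string as a list of separate topics. The function name `split_subjects` and the choice of the semicolon as separator are my own assumptions for illustration; nothing here is taken from XChange Editor or any other product.

```python
def split_subjects(subject: str, separator: str = ";") -> list[str]:
    """Split a raw "subject" string into individual, trimmed topics.

    Empty pieces (for example from a trailing separator) are dropped.
    """
    return [topic.strip() for topic in subject.split(separator) if topic.strip()]


raw = "scientific method; explanation; scientific truth; testability; Karl Popper"
topics = split_subjects(raw)

# Each topic can now be sorted or matched on its own, ignoring letter case.
print(sorted(topics, key=str.casefold))
# ['explanation', 'Karl Popper', 'scientific method', 'scientific truth', 'testability']
```

Once the string is split this way, sorting and summarizing can work per topic instead of per whole string.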

Besides the classic full text search, I like being able to search within the “subject” and “content” properties of ALL annotations in ALL PDF files in a folder and its subfolders. But since the PDF files are not indexed, the search can take a very long time if you have 5 gigabytes of PDF files (which is my case), especially if the search is a full text search.

Also, the results are displayed in the already overcrowded Search Pane. The really good point is that when you click on a given result, the corresponding PDF file opens quickly at the right place. For example, if I do a search solely within the “subject” and “content” properties of all annotations in all my PDF files (not a full text search), each line in the result window in the Search Pane points to an annotation in a specific document. If I click a line, the corresponding PDF file opens where the text is highlighted. It’s good but not enough.

So, I think Searching is more powerful than Sorting and Summarizing in most PDF software, which is why I use it much more than Sorting and Summarizing. However, the Search Pane in XChange Editor is still not a full-fledged annotation management system. For example, I cannot get a list of all the subjects I have created in the “subject” property of all my annotations! From the example above, I’d like to have access to something like this, in alphabetical order:


               Karl Popper

               scientific method

               scientific truth


Right now, I have to keep track manually of all the subjects I have created, which is an impossible task. Remember that I am talking about 5 gigabytes of PDF files, that is, hundreds of PDF files and thousands of annotations (mostly Highlight).
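As a rough illustration of the kind of list I mean, the following Python sketch collects every distinct topic from a set of annotations and prints them alphabetically with a count. The file names and the sample data are made up for the example, and the semicolon separator is an assumption; reading the annotations out of real PDF files is deliberately left out.

```python
from collections import Counter

# Hypothetical sample data: (file name, raw "subject" string) per annotation.
annotations = [
    ("paper_a.pdf", "scientific method; testability; Karl Popper"),
    ("paper_b.pdf", "scientific method; explanation"),
    ("paper_c.pdf", "scientific truth; Karl Popper"),
]


def subject_counts(annotations):
    """Count how many annotations carry each topic."""
    counts = Counter()
    for _file, subject in annotations:
        # A set avoids counting a topic twice within the same annotation.
        topics = {t.strip() for t in subject.split(";") if t.strip()}
        counts.update(topics)
    return counts


counts = subject_counts(annotations)
for topic in sorted(counts, key=str.casefold):
    print(f"{topic} ({counts[topic]})")
```

The same structure works whether there are three annotations or several thousand; only the step that feeds `annotations` changes.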

Now, let’s forget the Sort, Summarize and Search features as implemented by most professional PDF software. Try to think differently. Think about a new paradigm for managing annotations.

As I said before, the focus is on the annotation, not on the whole PDF file, and we need a way to access all annotations in all PDF files on a device based on subjects created by the user.

So, we need two main windows: a classical window, or Document Window, to display a PDF file (or many PDF files) and an Annotation Window to display annotations (think about index cards) based on selected subjects. As with the Document Window, it has to be possible to open many tabs in the Annotation Window in order to work on many subjects or projects at the same time.

The subjects can be listed in a Subject Pane, so the software has to maintain a subject database. From there we can select several subjects at the same time, and the corresponding annotations (from all PDF files on the device) will be displayed in an Annotation Window. Of course, the Subject Pane should provide subject management tools, such as renaming a subject across all annotations, across a subset of annotations or across a subset of PDF files. Many more ideas come easily to mind.
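A subject database of this kind could be sketched, under many simplifying assumptions, as an in-memory index from subject to annotations. The names `Annotation` and `SubjectIndex`, the fields, and the sample values below are all hypothetical; the sketch only updates data in memory and does not write anything back to a PDF file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Annotation:
    file: str            # path of the PDF file holding the annotation
    page: int            # page number of the highlight
    subjects: list[str]  # topics already split out of the "subject" property
    content: str = ""    # the comment typed in the pop-up window


class SubjectIndex:
    """Maps each subject to the annotations that carry it."""

    def __init__(self):
        self._by_subject = defaultdict(list)

    def add(self, annotation):
        for subject in annotation.subjects:
            self._by_subject[subject].append(annotation)

    def subjects(self):
        """All known subjects in alphabetical order (for a Subject Pane)."""
        return sorted(self._by_subject, key=str.casefold)

    def select(self, *subjects):
        """Annotations carrying at least one selected subject, no duplicates."""
        seen, result = set(), []
        for subject in subjects:
            for ann in self._by_subject.get(subject, []):
                if id(ann) not in seen:
                    seen.add(id(ann))
                    result.append(ann)
        return result

    def rename(self, old, new):
        """Rename a subject across every annotation that uses it."""
        for ann in self._by_subject.pop(old, []):
            ann.subjects = [new if s == old else s for s in ann.subjects]
            if all(existing is not ann for existing in self._by_subject[new]):
                self._by_subject[new].append(ann)


index = SubjectIndex()
a1 = Annotation("a.pdf", 3, ["scientific method", "testability"])
a2 = Annotation("b.pdf", 12, ["testability", "Karl Popper"])
index.add(a1)
index.add(a2)
index.rename("testability", "falsifiability")
print(index.subjects())
# ['falsifiability', 'Karl Popper', 'scientific method']
```

Selecting several subjects here returns the union of their annotations; an intersection ("all selected subjects") would be an equally reasonable design and is a one-line change.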

When reading and annotating a PDF file, the GUI has to offer an easy way to enter or create subjects. Right now, with XChange Editor we have to open the Property Pane to get access to the “subject” property of an annotation. It has to be possible to enter or create a subject (or many subjects) directly in the pop-up window just as we fill the “content” property (usually called a comment) in the pop-up window. In addition, the field used to enter or to create a subject has to display a drop-down list of all the subjects already created so that it is possible to select topics that already exist.
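The drop-down list could behave roughly like the Python sketch below, which filters the subjects already created as the user types. The function name `suggest_subjects`, the `limit` parameter and the ranking rule (prefix matches first, then alphabetical) are assumptions of mine, not a description of any existing software.

```python
def suggest_subjects(typed, known_subjects, limit=10):
    """Return existing subjects that match what the user has typed so far."""
    needle = typed.strip().casefold()
    matches = [s for s in known_subjects if needle in s.casefold()]
    # Subjects starting with the typed text come first, then alphabetical order.
    matches.sort(key=lambda s: (not s.casefold().startswith(needle), s.casefold()))
    return matches[:limit]


known = ["explanation", "Karl Popper", "scientific method",
         "scientific truth", "testability"]
print(suggest_subjects("sci", known))
# ['scientific method', 'scientific truth']
```

If nothing matches, the list is empty and the typed text simply becomes a new subject.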

It is not necessary, I think, to integrate a kind of project manager. A user only has to create special subjects for his projects. For example, the user can begin all his project names with “Project” followed by a name or a number. He can enter “Project-001” or “Project-Popper and its enemies” in the “subject” property of an annotation.

Also, it is not necessary, I think, to force the user to put all his PDF files in a special folder managed by the PDF Annotator. People already manage PDF files in different ways. Most users already work with a reference manager or a sync service. Anyway, since the focus is on annotations, a PDF Annotator only has to scan a device for PDF files and the annotations within those files. It does not matter if a file changes name or location; the software only needs to keep scanning the device for PDF annotations and update its subject database (and its index database for full text search).
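The scanning step could look something like this minimal Python sketch, which walks a folder tree and reports the PDF files that are new or modified since the previous scan. The function name `find_changed_pdfs` and the use of modification times to detect changes are assumptions for illustration; extracting the annotations from each file would need a PDF library and is not shown.

```python
import tempfile
from pathlib import Path


def find_changed_pdfs(root, known_mtimes):
    """Return PDF files under `root` that are new or modified since last scan.

    `known_mtimes` maps a path string to the modification time seen on the
    previous scan; it is updated in place.
    """
    changed = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() != ".pdf":
            continue
        mtime = path.stat().st_mtime
        key = str(path)
        if known_mtimes.get(key) != mtime:
            known_mtimes[key] = mtime
            changed.append(path)
    # Forget files that no longer exist at their old path (deleted, renamed
    # or moved); a renamed file shows up above as a new path and is rescanned.
    for key in list(known_mtimes):
        if not Path(key).exists():
            del known_mtimes[key]
    return changed


# Small demonstration in a throwaway folder.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "sub").mkdir()
    (Path(root) / "sub" / "paper.PDF").write_bytes(b"%PDF-1.7\n")
    (Path(root) / "notes.txt").write_text("not a PDF")
    seen = {}
    first = find_changed_pdfs(root, seen)   # finds the one PDF
    second = find_changed_pdfs(root, seen)  # nothing changed since
    print(len(first), len(second))  # 1 0
```

Only the files returned by the scan need their annotations re-read, which keeps the subject database current without re-indexing 5 gigabytes each time.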

If a user works on multiple devices, there is no need, I think, to create yet another synchronization service; these already proliferate like a plague on the Web. No matter how a user synchronizes his PDF files (Google Drive, OneDrive, Dropbox, Resilio, etc.), each copy of a PDF Annotator installed on a device works independently of the others and scans only the device on which it is installed.

Yes, I know, there is software called PDF Annotator (www.pdfannotator.com). In the best case, it seems to be a PDF presentation tool, and in the worst case it is a toy, not a tool.

Now let's talk about Qiqqa.

Qiqqa is the closest thing I have seen to a full-fledged PDF Annotator. But there are major flaws. Note that Qiqqa talks about tags, which play the same role as the “subject” property of a PDF file.

The biggest flaw is that the main features are not PDF compliant. Annotations and tags (“subject” property) do not comply with the ISO standards for the PDF format. So, if Qiqqa goes bankrupt you are in big trouble. And even if Qiqqa does not go bankrupt, it's impossible to work on a device that does not have Qiqqa installed, which means you cannot collaborate with colleagues who do not use Qiqqa.

Qiqqa has the right idea in focusing on annotations and using tags (“subject” property) to manage annotations (and PDF files), which is quite powerful, but the main window displays PDF files based on selected tags (“subject” property). To see annotations based on selected tags, you have to generate a report, which is awkward. Qiqqa is still quite powerful, since you can click on an annotation in the report to open the PDF file at the right place. But it is highly preferable, as I explained above, to have windows (or tabs) which dynamically display annotations based on selected subjects (tags in Qiqqa jargon).

Another big mistake: Qiqqa tries to be one big program that manages the whole workflow of a research project. For example, it integrates a reference management system, tools to discover new papers to read and where to focus your efforts, and the Expedition tool, which automatically breaks your library into themes so that you can quickly get up to speed with and understand your field of research. These may be good ideas, I am not sure, but what is certain is that these functions should not interfere with the main functionality, which is the management of annotations.

I hope I was clear enough. But above all, I hope someone will find these ideas interesting, and potentially profitable enough, to start programming such software.

I am ready to be an alpha tester or an early beta tester.



