Details
Suggestion
Resolution: Unresolved
Not Evaluated
Description
Currently, if a document contains accented characters, you can't find them by searching with unaccented characters: searching for "far" doesn't find "får", for example. (TBH I don't know how common it is for such applications to match that way; macOS Preview doesn't, for example. Maybe that's a lousy example anyway, because å is not just an accented letter; it's a separate letter of the Norwegian alphabet.) Apparently we have a customer asking for something like that, and they found this block of code in qtwebengine/src/3rdparty/chromium/pdf/pdfium/pdfium_engine.cc:
void PDFiumEngine::StartFind(const std::string& text, bool case_sensitive) {
  ...
  // Don't use PDFium to search for now, since it doesn't support unicode
  // text. Leave the code for now to avoid bit-rot, in case it's fixed later.
  // The extra parens suppress a -Wunreachable-code warning.
  if ((false)) {
    SearchUsingPDFium(str, case_sensitive, first_search,
                      character_to_start_searching_from, current_page);
  } else {
    SearchUsingICU(str, case_sensitive, first_search,
                   character_to_start_searching_from, current_page);
  }
So there's a request to have such a feature in QtPDF.
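To make the request concrete, here is a minimal sketch of what "accent-insensitive" matching could mean. It is not how PDFium, ICU, or QtPDF do it: the helpers foldDiacritic(), foldString() and findFolded() are hypothetical names invented for this illustration, and the mapping covers only a handful of precomposed Latin letters. A real implementation would more likely rely on Unicode decomposition or a collation-based search, and would have to decide per locale whether letters such as å should fold at all.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: map a few precomposed Latin letters to their base
// letter. Anything not listed is returned unchanged.
char32_t foldDiacritic(char32_t c) {
    switch (c) {
    case U'\u00E0': case U'\u00E1': case U'\u00E2':
    case U'\u00E3': case U'\u00E4': case U'\u00E5':
        return U'a';
    case U'\u00E8': case U'\u00E9': case U'\u00EA': case U'\u00EB':
        return U'e';
    case U'\u00F2': case U'\u00F3': case U'\u00F4':
    case U'\u00F5': case U'\u00F6':
        return U'o';
    case U'\u00F9': case U'\u00FA': case U'\u00FB': case U'\u00FC':
        return U'u';
    default:
        return c;
    }
}

// Fold every code point of a UTF-32 string. The mapping is one-to-one per
// code point, so indices in the folded string match the original string.
std::u32string foldString(const std::u32string& s) {
    std::u32string out;
    out.reserve(s.size());
    for (char32_t c : s)
        out.push_back(foldDiacritic(c));
    return out;
}

// Return the index of the first accent-insensitive match of needle in
// haystack, or std::u32string::npos if there is none. Case is not folded.
std::size_t findFolded(const std::u32string& haystack,
                       const std::u32string& needle) {
    return foldString(haystack).find(foldString(needle));
}
```

With this sketch, findFolded(U"Jeg f\u00E5r", U"far") finds the match at index 4, whereas a plain std::u32string::find() on the unfolded text would not. Note that folding both sides also means a search for "får" matches "far", which may or may not be what the customer expects.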
Currently we only use the designated public API in src/3rdparty/chromium/third_party/pdfium/public. Adding a breakpoint and/or a cout in PDFiumEngine::StartFind() shows that this function is never reached when we do our search using FPDFText_FindStart().
The implementation behind FPDFText_FindStart() instead seems to be in qtwebengine/src/3rdparty/chromium/third_party/pdfium/core/fpdftext/cpdf_textpagefind.cpp