In Qt < 5.10 (and also sometimes on Windows), we get extra spaces or newlines
when moving to the end of the document. However, this only happens *sometimes*,
and manual testing confirms that with the current workaround, we actually lose
the last char in the selection.
I'm not sure what's happening there, but instead of making things worse with
the workaround, let's just be a bit less strict with the checking there and
accept both variants... This seems like some Chromium bug we can't do much
about.
Resolves #4199.
To avoid accidentally highlighting characters that were introduced by
html escaping the text before feeding it to setHtml, we can't just
escape the whole string before adding the highlighting. Instead, we need
to break the string up on the pattern, format and escape the individual
parts, then join them back together.
re.split includes empty strings if there is a match at the start/end,
which ensures that matches always land on odd indices:
https://docs.python.org/3/library/re.html#re.split
> If there are capturing groups in the separator and it matches at the
> start of the string, the result will start with an empty string. The
> same holds for the end of the string
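A minimal sketch of the split/escape/join approach described above. The
function name `highlight` and the `highlight` CSS class are assumptions for
illustration, not necessarily what the actual code uses:

```python
import html
import re


def highlight(text: str, pattern: str) -> str:
    """Wrap case-insensitive matches of `pattern` in a highlight span.

    Every part is escaped on its own, so entities such as "&lt;" are
    never searched for the pattern.
    """
    if not pattern:
        return html.escape(text, quote=False)
    # The capturing group makes re.split keep the matched separators,
    # which therefore always land on odd indices.
    parts = re.split("({})".format(re.escape(pattern)), text,
                     flags=re.IGNORECASE)
    out = []
    for i, part in enumerate(parts):
        escaped = html.escape(part, quote=False)
        if i % 2 == 1:  # odd index: this part is a match
            # Hypothetical markup; the class name is an assumption.
            escaped = '<span class="highlight">{}</span>'.format(escaped)
        out.append(escaped)
    return "".join(out)
```

With this, `highlight("a<b", "l")` leaves the "l" in "&lt;" alone, because
the pattern is matched before anything is escaped.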
Resolves the example case in #4199, but not the larger problem. We don't
need to escape quotes as we don't put the string in an attribute value.
From the docs at
https://docs.python.org/3/library/html.html#html.escape:
> If the optional flag quote is true, the characters (") and (') are also
> translated; this helps for inclusion in an HTML attribute value
> delimited by quotes, as in <a href="...">.
Escaping quotes means we end up with a literal &#x27; in the completion
view wherever there is a quote in the source text.
However, the problem in #4199, where unexpected parts of the text are
highlighted, can also happen with '<', '>', and '&', which still must be
escaped.
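A short illustration of the difference the quote flag makes, using only
html.escape from the standard library (the sample text is made up):

```python
import html

text = "it's <b> & co"

# Default (quote=True): the apostrophe is translated as well.
with_quotes = html.escape(text)
# quote=False: only '&', '<' and '>' are translated.
without_quotes = html.escape(text, quote=False)

print(with_quotes)     # it&#x27;s &lt;b&gt; &amp; co
print(without_quotes)  # it's &lt;b&gt; &amp; co
```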
There were no unit tests for this whole module. It is difficult to test
due to all the private logic and Qt dependencies, but with a lot of
mocking we can at least validate some of the text handling.
This is a setup to start testing the solution to #4199.
I picked '{' and '}' as placeholders in the test data because they draw
the eye to the 'highlighted' part, and vim even highlights them with
python syntax highlighting. It could be confusing though, as they look
like format strings but are not used that way.
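As a rough sketch of how such placeholders could be expanded in a test (the
helper name and the span markup are assumptions, not taken from the real
test module):

```python
def expand_placeholders(expected: str) -> str:
    """Turn '{' and '}' in test data into hypothetical highlight tags.

    Plain str.replace is used on purpose: the braces are markers only
    and are never passed to str.format.
    """
    return (expected
            .replace("{", '<span class="highlight">')
            .replace("}", "</span>"))
```

A test case written as `{foo}bar` would then be compared against
`<span class="highlight">foo</span>bar`.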
When we click the download button in PDF.js, it downloads a blob://qute:...
URL. We can detect that and force a download rather than opening it in PDF.js
again.
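The detection could look roughly like this. It is a sketch using the
standard library rather than Qt's URL classes; the function name and the
example URLs in the usage note are hypothetical:

```python
from urllib.parse import urlsplit


def should_force_download(url: str) -> bool:
    """Return True for blob: URLs, which should be downloaded, not viewed."""
    return urlsplit(url).scheme == "blob"
```

For instance, `should_force_download("blob:qute://pdfjs/example")` is true,
while a plain `qute://pdfjs/web/viewer.html` URL is left to PDF.js.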
Note that what actually happens depends on the Qt version and backend:
QtWebKit (any Qt version):
Downloads always work properly.
QtWebEngine, Qt 5.7.1:
Downloads work.
QtWebEngine, Qt 5.9 - 5.11:
Downloads won't work as we need to tell PDF.js to not use blob: URLs:
https://bugreports.qt.io/browse/QTBUG-70420 - in theory, PDF.js could fall back
to downloading the existing qute:// URL, but it has a whitelist of schemes
which does not include qute://... Since it's not in that whitelist, it just
ends up doing nothing at all.
QtWebEngine, Qt 5.12:
Downloads should hopefully work properly again, as we can register the qute://
scheme with Chromium, which allows us to use blob:// URLs.