Commit Graph

8328 Commits

Author SHA1 Message Date
Daniel
c2218f51cd Add mhtml.last_used_directory to vulture whitelist 2015-11-09 17:01:08 +01:00
Daniel
f34161423c Fix "line too long" 2015-11-09 17:01:08 +01:00
Daniel
a780325a3a Allow directories to be entered as destination
The filename will then default to 'page title.mht'
2015-11-09 17:01:08 +01:00
Daniel
ae8a9b8798 Handle non-ASCII in headers/url better 2015-11-09 17:01:08 +01:00
Daniel
8bb887ddab Specify window and tab instead of 'current' 2015-11-09 17:01:08 +01:00
Daniel
a1e0ccb787 Fix spelling/style. 2015-11-09 17:01:08 +01:00
Daniel
f2f9529af7 Remove sys import in test_mhtml 2015-11-09 17:01:08 +01:00
Daniel
ed8a6a4c7b Update to cssutils 1.0.1
This fixes cssutils on Python 3.5 (yay!).
2015-11-09 17:01:08 +01:00
Daniel
3a2bb2d348 Add cssutils to README and utils/version.py 2015-11-09 17:01:08 +01:00
Daniel
12a9deb9bc Fix lints 2015-11-09 17:01:08 +01:00
Daniel
d1f8d29c20 Add --mhtml flag to :download
And remove :download-whole command.
2015-11-09 17:01:08 +01:00
Daniel
8cf0af004f Deprecate :download [url] [dest], add --dest param
:download --dest [dest] [url] is the new syntax.
2015-11-09 17:00:46 +01:00
Daniel
a898fd21d1 Update docs 2015-11-09 17:00:46 +01:00
Daniel
b17d74452f Expand $HOME before checking if file exists
Otherwise we might accidentally overwrite a file.
2015-11-09 16:59:16 +01:00
Daniel
dd8ff860f4 Fix lint 2015-11-09 16:59:16 +01:00
Daniel
b027e6af1b Mark cssutils tests as xfail on Python >= 3.5 2015-11-09 16:59:16 +01:00
Daniel
dab0db30a5 Remove tests for remove_file from test_mhtml.py 2015-11-09 16:59:16 +01:00
Daniel
919365dfa1 Remove dead code mhtml.py:MHTMLWriter:remove_file 2015-11-09 16:59:16 +01:00
Daniel
1902e4858f Also catch re.error on cssutils import
cssutils 1.0 and earlier are broken on Python 3.5 due to a bad regex
escape.
2015-11-09 16:59:16 +01:00
Daniel
957d68c477 Revert "Remove cssutils from mhtml.py"
This reverts commit 22a0f0952704d284846ab2572790d99a85515c57.
2015-11-09 16:59:16 +01:00
Daniel
ce1a99cc7c Remove cssutils from mhtml.py 2015-11-09 16:59:16 +01:00
Daniel
706b8c6600 Shorten line 2015-11-09 16:59:16 +01:00
Daniel
6601df14a3 mhtml: ask before overwriting dest 2015-11-09 16:59:16 +01:00
Daniel
420c087373 use cssutils 2015-11-09 16:59:16 +01:00
Daniel
749b1c02cc Style changes for mhtml and test_mhtml 2015-11-09 16:59:16 +01:00
Daniel
b05a0d191d Fix module path in test_mhtml
Also fix docstring for _get_css_imports
2015-11-09 16:59:16 +01:00
Daniel
2eeace1c2c Move misc.mhtml to browser.mhtml 2015-11-09 16:59:16 +01:00
Daniel
a092ef1fe6 String quote style changes
"" for user facing strings
'' for internal strings
except when quotes appear inside a string, to avoid escaping them
2015-11-09 16:59:16 +01:00
Daniel
9bf9124324 Fix mhtml tests, add test for _NoCloseBytesIO 2015-11-09 16:59:16 +01:00
Daniel
366916a8bf Use more specific selectors to filter webelements 2015-11-09 16:59:16 +01:00
Daniel
bf90c8c06b Add tests for mhtml
This also makes the output of MHTMLWriter deterministic, by

1) Setting the boundary at object creation, allowing uuid.uuid4 to be
   monkey patched

2) Outputting the files in sorted order (sorted by location), as python
   dicts are unordered by default.
2015-11-09 16:59:16 +01:00
Daniel
5fcbc839bb Allow many spaces and tabs after @import in CSS 2015-11-09 16:59:16 +01:00
Daniel
afa2f339e6 mhtm: use downloads logger instead of misc 2015-11-09 16:59:16 +01:00
Daniel
cb477a2623 Decode headers with ISO-8859-1 instead of ASCII 2015-11-09 16:59:16 +01:00
Daniel
a63aed5965 Use email.encoders instead of own encoder function 2015-11-09 16:59:16 +01:00
Daniel
ba81332d45 _get_css_imports now works on strings only
This also means that it returns strings, making the calls to .decode
unneeded.
2015-11-09 16:59:16 +01:00
Daniel
d3a21927f2 Remove default values in MHTMLWriter.__init__ 2015-11-09 16:59:16 +01:00
Daniel
e5bfb9884b Use WebElementWrapper instead of QWebElement
* also don't derive from object
* also set the _used flag on _Downloader
2015-11-09 16:59:16 +01:00
Daniel
a3cc71e317 Don't from-import functions/classes 2015-11-09 16:59:16 +01:00
Daniel
83aee4fad5 Rename on_meta_data_change to on_meta_data_changed 2015-11-09 16:59:16 +01:00
Daniel
8593144fa7 Make _path_suggestion public 2015-11-09 16:59:16 +01:00
Daniel
f58f6f24ee Use email.mime instead of manually writing the msg 2015-11-09 16:59:16 +01:00
Daniel
64c74bde90 Fix pylint for _NoCloseBytesIO 2015-11-09 16:59:16 +01:00
Daniel
05cc4b9650 Change boundary
This version contains a sequence that is illegal in quoted-printable
and thus safe from accidentally appearing in a website.
2015-11-09 16:59:16 +01:00
Daniel
8eafa1a105 Also scan CSS in <style> tags and inline CSS
As both may contain external links too (@import, url(...))
2015-11-09 16:59:16 +01:00
Daniel
02c1fa1232 Save mhtml if no assets need to be downloaded 2015-11-09 16:59:16 +01:00
Daniel
991b6d4fc9 Remove urljoin import 2015-11-09 16:59:16 +01:00
Daniel
5c6b715720 Use QUrl.resolved instead of urlparse.urljoin 2015-11-09 16:59:16 +01:00
Daniel
11ed60620a Also load assets referenced in css files
Things like "@import stylesheet.css" and "url(...)".
2015-11-09 16:59:16 +01:00
Daniel
6b086d159d Ask for filename when none is given 2015-11-09 16:59:16 +01:00