Commit Graph

6232 Commits

Author SHA1 Message Date
Daniel
d1f8d29c20 Add --mhtml flag to :download
And remove :download-whole command.
2015-11-09 17:01:08 +01:00
Daniel
8cf0af004f Deprecate :download [url] [dest], add --dest param
:download --dest [dest] [url] is the new syntax.
2015-11-09 17:00:46 +01:00
Daniel
a898fd21d1 Update docs 2015-11-09 17:00:46 +01:00
Daniel
b17d74452f Expand $HOME before checking if file exists
Otherwise we might accidentally overwrite a file.
2015-11-09 16:59:16 +01:00
Daniel
dd8ff860f4 Fix lint 2015-11-09 16:59:16 +01:00
Daniel
b027e6af1b Mark cssutils tests as xfail on Python >= 3.5 2015-11-09 16:59:16 +01:00
Daniel
dab0db30a5 Remove tests for remove_file from test_mhtml.py 2015-11-09 16:59:16 +01:00
Daniel
919365dfa1 Remove dead code mhtml.py:MHTMLWriter:remove_file 2015-11-09 16:59:16 +01:00
Daniel
1902e4858f Also catch re.error on cssutils import
cssutils 1.0 and earlier are broken on Python 3.5 due to a bad regex
escape.
2015-11-09 16:59:16 +01:00
Daniel
957d68c477 Revert "Remove cssutils from mhtml.py"
This reverts commit 22a0f0952704d284846ab2572790d99a85515c57.
2015-11-09 16:59:16 +01:00
Daniel
ce1a99cc7c Remove cssutils from mhtml.py 2015-11-09 16:59:16 +01:00
Daniel
706b8c6600 Shorten line 2015-11-09 16:59:16 +01:00
Daniel
6601df14a3 mhtml: ask before overwriting dest 2015-11-09 16:59:16 +01:00
Daniel
420c087373 use cssutils 2015-11-09 16:59:16 +01:00
Daniel
749b1c02cc Style changes for mhtml and test_mhtml 2015-11-09 16:59:16 +01:00
Daniel
b05a0d191d Fix module path in test_mhtml
Also fix docstring for _get_css_imports
2015-11-09 16:59:16 +01:00
Daniel
2eeace1c2c Move misc.mhtml to browser.mhtml 2015-11-09 16:59:16 +01:00
Daniel
a092ef1fe6 String quote style changes
"" for user facing strings
'' for internal strings
except when quotes appear inside a string, to avoid escaping them
2015-11-09 16:59:16 +01:00
Daniel
9bf9124324 Fix mhtml tests, add test for _NoCloseBytesIO 2015-11-09 16:59:16 +01:00
Daniel
366916a8bf Use more specific selectors to filter webelements 2015-11-09 16:59:16 +01:00
Daniel
bf90c8c06b Add tests for mhtml
This also makes the output of MHTMLWriter deterministic, by

1) Setting the boundary at object creation, allowing uuid.uuid4 to be
   monkey patched

2) Outputting the files in sorted order (sorted by location), as python
   dicts are unordered by default.
2015-11-09 16:59:16 +01:00
Daniel
5fcbc839bb Allow many spaces and tabs after @import in CSS 2015-11-09 16:59:16 +01:00
Daniel
afa2f339e6 mhtm: use downloads logger instead of misc 2015-11-09 16:59:16 +01:00
Daniel
cb477a2623 Decode headers with ISO-8859-1 instead of ASCII 2015-11-09 16:59:16 +01:00
Daniel
a63aed5965 Use email.encoders instead of own encoder function 2015-11-09 16:59:16 +01:00
Daniel
ba81332d45 _get_css_imports now works on strings only
This also means that it returns strings, making the calls to .decode
unneeded.
2015-11-09 16:59:16 +01:00
Daniel
d3a21927f2 Remove default values in MHTMLWriter.__init__ 2015-11-09 16:59:16 +01:00
Daniel
e5bfb9884b Use WebElementWrapper instead of QWebElement
* also don't derive from object
* also set the _used flag on _Downloader
2015-11-09 16:59:16 +01:00
Daniel
a3cc71e317 Don't from-import functions/classes 2015-11-09 16:59:16 +01:00
Daniel
83aee4fad5 Rename on_meta_data_change to on_meta_data_changed 2015-11-09 16:59:16 +01:00
Daniel
8593144fa7 Make _path_suggestion public 2015-11-09 16:59:16 +01:00
Daniel
f58f6f24ee Use email.mime instead of manually writing the msg 2015-11-09 16:59:16 +01:00
Daniel
64c74bde90 Fix pylint for _NoCloseBytesIO 2015-11-09 16:59:16 +01:00
Daniel
05cc4b9650 Change boundary
This version contains a sequence that is illegal in quoted-printable
and thus safe from accidentally appearing in a website.
2015-11-09 16:59:16 +01:00
Daniel
8eafa1a105 Also scan CSS in <style> tags and inline CSS
As both may contain external links too (@import, url(...))
2015-11-09 16:59:16 +01:00
Daniel
02c1fa1232 Save mhtml if no assets need to be downloaded 2015-11-09 16:59:16 +01:00
Daniel
991b6d4fc9 Remove urljoin import 2015-11-09 16:59:16 +01:00
Daniel
5c6b715720 Use QUrl.resolved instead of urlparse.urljoin 2015-11-09 16:59:16 +01:00
Daniel
11ed60620a Also load assets referenced in css files
Things like "@import stylesheet.css" and "url(...)".
2015-11-09 16:59:16 +01:00
Daniel
6b086d159d Ask for filename when none is given 2015-11-09 16:59:16 +01:00
Daniel
679ab65b5f Message on finished download 2015-11-09 16:59:16 +01:00
Daniel
fd7820ea16 occurs -> occurred 2015-11-09 16:59:16 +01:00
Daniel
111feebf89 Refactor start_download to a class 2015-11-09 16:59:16 +01:00
Daniel
49a32f0041 First round of lint fixes 2015-11-09 16:59:16 +01:00
Daniel
024ae52366 Replaced quote-printable with own function
The original one had some inconsistencies that lead to bugs.

The content-type of the root document now also contains the charset.
2015-11-09 16:59:16 +01:00
Daniel
930871be01 First working version
The files can be opened with qutebrowser

Problems still with Umlauts in the encoded file.
2015-11-09 16:59:16 +01:00
Daniel
fbe5386e56 Initial version of website downloader
Saving websites as MHTML via :download-whole

Still needs some cleanup and a "ask for save path".
2015-11-09 16:59:16 +01:00
Florian Bruhin
99e090db78 tox: Update werkzeug to 0.11.
Version 0.11
------------

Released on November 8th 2015, codename Gleisbaumaschine.

- Added ``reloader_paths`` option to ``run_simple`` and other functions in
  ``werkzeug.serving``. This allows the user to completely override the Python
  module watching of Werkzeug with custom paths.
- Many custom cached properties of Werkzeug's classes are now subclasses of
  Python's ``property`` type.
- ``bind_to_environ`` now doesn't differentiate between implicit and explicit
  default port numbers in ``HTTP_HOST``.
- ``BuildErrors`` are now more informative. They come with a complete sentence
  as error message, and also provide suggestions.
- Fix a bug in the user agent parser where Safari's build number instead of
  version would be extracted.
- Fixed issue where RedisCache set_many was broken for twemproxy, which doesn't
  support the default MULTI command.
- ``mimetype`` parameters on request and response classes are now always
  converted to lowercase.
- Changed cache so that cache never expires if timeout is 0. This also fixes
  an issue with redis setex
- Werkzeug now assumes ``UTF-8`` as filesystem encoding on Unix if Python
  detected it as ASCII.
- New optional `has` method on caches.
- Fixed various bugs in `parse_options_header`.
- If the reloader is enabled the server will now open the socket in the parent
  process if this is possible.  This means that when the reloader kicks in
  the connection from client will wait instead of tearing down.  This does
  not work on all Python versions.
- Implemented PIN based authentication for the debugger.  This can optionally
  be disabled but is discouraged.  This change was necessary as it has been
  discovered that too many people run the debugger in production.
- Devserver no longer requires SSL module to be installed.

Version 0.10.5
--------------

(bugfix release, release date yet to be decided)

- Reloader: Correctly detect file changes made by moving temporary files over
  the original, which is e.g. the case with PyCharm.
- Fix bool behavior of ``werkzeug.datastructures.ETags`` under Python 3
2015-11-09 09:46:36 +01:00
Florian Bruhin
30db09bbda tox: Update pyroma to 1.8.3.
- Checking a PyPI package could fail under Python 3.
2015-11-09 09:46:02 +01:00
Florian Bruhin
0daf5885be Add some BDD tests for downloads. 2015-11-09 07:49:11 +01:00