78 Commits (ca3e054750cacf479b05bd0af1ac0fa4eff2d124)

Author SHA1 Message Date
Philipp Hagemeister db1f388878 [huffpost] Add support 10 years ago
Jaime Marquínez Ferrándiz 944d65c762 [extractor/common] Encode the url when calculating the md5 with `--write-pages` option
This doesn’t cause any problem in python 2.*, but on python 3 the `md5` function only accepts bytes.
10 years ago
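The bytes requirement described in 944d65c762 can be illustrated with a minimal sketch. The helper name `url_md5` is hypothetical and the choice of UTF-8 is an assumption; the commit message only says the URL is encoded before hashing.

```python
import hashlib


def url_md5(url):
    """Return the hex MD5 digest of a URL given as text.

    hashlib.md5() only accepts bytes on Python 3, so the text is
    encoded first (UTF-8 is an assumption made for this sketch).
    """
    return hashlib.md5(url.encode('utf-8')).hexdigest()


if __name__ == '__main__':
    print(url_md5('http://example.com/watch?v=1'))
```

On Python 3, passing the unencoded string to `hashlib.md5` raises `TypeError`, which is the failure the commit avoids.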
Philipp Hagemeister 1394ce65b4 [youtube] Add new formats (Fixes #2221) 10 years ago
Philipp Hagemeister 50317b111d Merge branch 'youtube-dash-manifest'
Conflicts:
	youtube_dl/extractor/youtube.py
10 years ago
Philipp Hagemeister 9d4288b2d4 [extractor/common] Clarify when and when not we generate the filename 10 years ago
Philipp Hagemeister b60016e831 Deal with implicitly UTF-16 decoded webpages
These webpages don't specify an encoding and rely on the BOM
10 years ago
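A rough sketch of the BOM check that b60016e831 describes, not the project's actual code. The function name `decode_webpage` and the UTF-8 fallback are assumptions made for illustration.

```python
import codecs


def decode_webpage(data, default_encoding='utf-8'):
    """Decode raw page bytes, honouring a UTF-16 byte order mark.

    Pages that declare no encoding but start with a UTF-16 BOM are
    decoded as UTF-16; everything else falls back to default_encoding.
    UTF-32 BOMs are not handled in this sketch.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The 'utf-16' codec reads the BOM to pick the byte order
        # and strips it from the decoded text.
        return data.decode('utf-16')
    return data.decode(default_encoding, 'replace')


if __name__ == '__main__':
    raw = codecs.BOM_UTF16_LE + '<html>'.encode('utf-16-le')
    print(decode_webpage(raw))
```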
Philipp Hagemeister dd27fd1739 [youtube] Download DASH manifest
If given, download and parse the DASH manifest file, in order to get ultra-HQ formats.
Fixes #2166
10 years ago
Philipp Hagemeister 3ec05685f7 [extractor/common] Limit --write-pages filename to 200 chars
This avoids problems with very long URLs.
10 years ago
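One way the 200-character limit from 3ec05685f7 could be applied is sketched below. Only the limit itself comes from the commit message; the function name `dump_filename`, the `video_id_url` naming scheme, the `.dump` extension, and the hash suffix that keeps truncated names distinct are all assumptions.

```python
import hashlib

MAX_BASENAME_LEN = 200  # limit named in the commit message


def dump_filename(video_id, url, max_len=MAX_BASENAME_LEN):
    """Build a page-dump filename whose basename has at most max_len chars."""
    basename = '%s_%s' % (video_id, url)
    if len(basename) > max_len:
        # Assumed scheme: replace the tail with a hash of the full name
        # so two long URLs sharing a prefix do not collide.
        suffix = '___' + hashlib.md5(basename.encode('utf-8')).hexdigest()
        basename = basename[:max_len - len(suffix)] + suffix
    # Path separators are not valid inside a single filename.
    return basename.replace('/', '_') + '.dump'


if __name__ == '__main__':
    print(dump_filename('abc', 'http://example.com/' + 'a' * 500))
```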
Philipp Hagemeister 9933b57430 [pornhub] Use centralized sorting 10 years ago
Philipp Hagemeister 3d3538e422 [khanacademy] Add support (Fixes #2066) 10 years ago
Philipp Hagemeister 5d73273f6f [orf] Use new extraction method (Fixes #2057) 10 years ago
Philipp Hagemeister 9887c9b2d6 [jpopsuki] Simplify 11 years ago
Philipp Hagemeister 08d13955dd [wistia] Prefer original video format above all others
We could also set up a formula which would weigh filesize/bitrate and vcodec/acodec (say, 1GB h264 < 3 GB MPEG2 < 2 GB h264), but that would get really messy real soon.
11 years ago
Philipp Hagemeister 5d4f3985be Document that format_id field should be present 11 years ago
Philipp Hagemeister 7217e148fb [yahoo] Use centralized sorting, and add tbr field 11 years ago
Philipp Hagemeister c7deaa4c74 [zdf] Use centralized sorting 11 years ago
Philipp Hagemeister e6812ac99d [spiegel] Use centralized sorting 11 years ago
Philipp Hagemeister 4bcc7bd1f2 Add temporary _sort_formats helper function 11 years ago
Philipp Hagemeister f49d89ee04 Add a resolution field and improve general --list-formats output 11 years ago
Philipp Hagemeister f45f96f8f8 [myvideo] Use RTMP instead of RTMPT (Fixes #2032) 11 years ago
Philipp Hagemeister 1538eff6d8 [bliptv] Remove support for direct downloads
This is now handled by the generic IE
11 years ago
Philipp Hagemeister aa94a6d315 [aparat] Add support (Fixes #2012) 11 years ago
Jaime Marquínez Ferrándiz c0d0b01f0e [generic] Detect ooyala videos (fixes #2013) 11 years ago
Philipp Hagemeister 46374a56b2 [youtube] Do not warn for videos with allow_rating=0
This fixes #1982
Test video: http://www.youtube.com/watch?v=gi2uH3YxohU
11 years ago
Itay Brandes 87a28127d2 _search_regex's "isatty" call fails with Py2exe's
_search_regex calls the sys.stderr.isatty() function for unix systems.

Py2exe uses a custom Stderr() stream which doesn't have an `isatty()`
function, leading to its crash.

Fixes easily with checking that it's a unix system first.
11 years ago
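The guard described in 87a28127d2 might look roughly like the following. The helper name `stream_is_unix_tty` is hypothetical, and the extra `getattr` check goes beyond what the commit message states; it is added here so the sketch also tolerates streams that lack `isatty()` on Unix.

```python
import os
import sys


def stream_is_unix_tty(stream=None):
    """Return True only on non-Windows systems when stream is a terminal.

    The platform is checked first, so isatty() is never called on
    Windows, where a replacement stderr object may not provide it.
    """
    if stream is None:
        stream = sys.stderr
    if os.name == 'nt':
        return False
    isatty = getattr(stream, 'isatty', None)
    return bool(callable(isatty) and isatty())


if __name__ == '__main__':
    print(stream_is_unix_tty())
```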
Philipp Hagemeister d67b0b1596 Reorder info_dict documentation 11 years ago
Philipp Hagemeister c0ba0f4859 Document duration field 11 years ago
Philipp Hagemeister e2b38da931 [mtv] Fixup incorrectly encoded XML documents 11 years ago
Philipp Hagemeister 7cc3570e53 Add fatal=False parameter to _download_* functions.
This allows us to simplify the calls in the youtube extractor even further.
11 years ago
Philipp Hagemeister 19e3dfc9f8 [9gag] Like/dislike count (#1895) 11 years ago
Philipp Hagemeister aaebed13a8 [smotri] Simplify 11 years ago
Philipp Hagemeister 2a275ab007 [zdf] Use _download_xml 11 years ago
Philipp Hagemeister 79d09f47c2 Merge branch 'opener-to-ydl' 11 years ago
Philipp Hagemeister c059bdd432 Remove quality_name field and improve zdf extractor 11 years ago
Philipp Hagemeister 02dbf93f0e [zdf/common] Use API in ZDF extractor.
This also comes with a lot of extra format fields
Fixes #1518
11 years ago
Philipp Hagemeister e03db0a077 Merge branch 'master' into opener-to-ydl 11 years ago
Jaime Marquínez Ferrándiz 267ed0c5d3 [collegehumor] Encode the xml before calling xml.etree.ElementTree.fromstring (fixes #1822)
Uses a new helper method in InfoExtractor: _download_xml
11 years ago
Philipp Hagemeister 7012b23c94 Match --download-archive during playlist processing (Fixes #1745) 11 years ago
Philipp Hagemeister dca0872056 Move the opener to the YoutubeDL object.
This is the first step towards being able to just import youtube_dl and start using it.
Apart from removing global state, this would fix problems like #1805.
11 years ago
Philipp Hagemeister 5904088811 Add support for tou.tv (Fixes #1792) 11 years ago
Philipp Hagemeister 91c7271aab Add automatic generation of format note based on bitrate and codecs 11 years ago
Jaime Marquínez Ferrándiz 78fb87b283 Don't accept '>' inside the content attribute in OpenGraph regexes 11 years ago
Jaime Marquínez Ferrándiz ab2d524780 Improve the OpenGraph regex
* Do not accept '>' between the property and content attributes.
* Recognize the properties if the content attribute is before the property attribute using two regexes (fixes the extraction of the description for SlideshareIE).
11 years ago
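The two-regex approach from ab2d524780 can be sketched as follows. This is an illustrative approximation rather than the project's patterns; `og_search` is a hypothetical name, and the sketch assumes quoted attribute values.

```python
import re


def og_search(prop, html):
    """Return the content of an OpenGraph <meta> tag, or None.

    Two patterns are tried so the attributes may appear in either
    order, and [^>] keeps a match from running past the end of the tag.
    """
    prop_re = r'property=["\']og:%s["\']' % re.escape(prop)
    content_re = r'content=["\']([^"\'>]*)["\']'
    patterns = [
        r'<meta\s[^>]*?%s[^>]*?%s' % (prop_re, content_re),  # property first
        r'<meta\s[^>]*?%s[^>]*?%s' % (content_re, prop_re),  # content first
    ]
    for pattern in patterns:
        match = re.search(pattern, html)
        if match:
            return match.group(1)
    return None


if __name__ == '__main__':
    print(og_search('description',
                    '<meta content="A talk" property="og:description">'))
```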
Philipp Hagemeister eb0a839866 [common] Simplify og_search_property 11 years ago
Marcin Cieślak a8eeb0597b Fix AssertionError when og property not found
On tvp.pl some webpages contain OpenGraph
metadata and some don't.

If og property is not found, _og_search_description
fails with

WARNING: unable to extract OpenGraph description; please report this issue on http://yt-dl.org/bug
Traceback (most recent call last):
  File "/usr/home/saper/bin/youtube-dl", line 18, in <module>
    youtube_dl.main()
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 766, in main
    _real_main(argv)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 719, in _real_main
    retcode = ydl.download(all_urls)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 715, in download
    videos = self.extract_info(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 348, in extract_info
    ie_result = ie.extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 125, in extract
    return self._real_extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/tvp.py", line 56, in _real_extract
    info['description'] = self._og_search_description(webpage)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 331, in _og_search_description
    return self._og_search_property('description', html, fatal=False, **kargs)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 325, in _og_search_property
    return unescapeHTML(escaped)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/utils.py", line 494, in unescapeHTML
    assert type(s) == type(u'')
AssertionError

The patch allows me to use:

  try:
    info['description'] = self._og_search_description(webpage)
    info['thumbnail'] = self._og_search_thumbnail(webpage)
  except RegexNotFoundError:
    pass
11 years ago
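The traceback in a8eeb0597b ends in `unescapeHTML` being handed a non-string because no OpenGraph tag matched. A minimal sketch of a guard against that case follows; it is not the actual patch. `og_description` is a hypothetical stand-alone function, and the standard library's `html.unescape` stands in for the project's `unescapeHTML`.

```python
import html
import re

_OG_DESCRIPTION_RE = (
    r'<meta\s[^>]*?property=["\']og:description["\']'
    r'[^>]*?content=["\']([^"\'>]*)["\']'
)


def og_description(webpage):
    """Return the unescaped og:description, or None when it is absent."""
    match = re.search(_OG_DESCRIPTION_RE, webpage)
    if match is None:
        # Nothing was found, so there is nothing to unescape.
        return None
    return html.unescape(match.group(1))


if __name__ == '__main__':
    print(og_description('<html><head></head></html>'))
```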
Jaime Marquínez Ferrándiz 9103bbc5cd Add the 'webpage_url' field to info_dict
The URL of the video page; it must allow the result to be reproduced.
It's automatically set by YoutubeDL if it's missing.
11 years ago
Philipp Hagemeister b5d0d817bc Remove superfluous space 11 years ago
Philipp Hagemeister ebc14f251c Merge remote-tracking branch 'origin/master' 11 years ago
Philipp Hagemeister d41e6efc85 New debug option --write-pages 11 years ago
Filippo Valsorda 8ffa13e03e [Instagram] get the non-https link, as they are serving Akamai cert from an instagram.com domain 11 years ago