2019年9月16日 星期一

用Python 模組 youtube-dl 下載 Youtube 直播影片

這一陣子因為有下載Youtube上公共電視台直播的需求,
爬文找解決方案。

剛好有在學Python,
所以就直接運用。

準備程序:
安裝Python,

安裝的過程請大家直接爬文,這部份網路上的教學太多了。

我是安裝miniconda。

以下是在miniconda的環境下來操作:

安裝 youtube-dl 和 ffmpeg

pip install youtube-dl

pip install ffmpeg


youtube-dl 模組支援的網站非常的多,請參考官方文件:
https://github.com/ytdl-org/youtube-dl/blob/master/docs/supportedsites.md

youtube-dl 的指令選項蠻多的,
鍵入youtube-dl --help後,可以查詢各項指令。

官方參考指令網址:
https://github.com/ytdl-org/youtube-dl/blob/master/README.md#options


下載 Youtube 影片:
youtube-dl https://www.youtube.com/watch?v=19g6IHsfsyY

執行結果如下:
(base) C:\Users\superuser>youtube-dl https://www.youtube.com/watch?v=19g6IHsfsyY
[youtube] 19g6IHsfsyY: Downloading webpage
[youtube] 19g6IHsfsyY: Downloading video info webpage
[download] Destination: 2017「未來與希望系列講座」5_27嚴長壽-19g6IHsfsyY.mp4
[download] 100% of 382.05MiB in 00:35



下載Youtube直播程序:
下載Youtube直播程序和上面的操作一樣,但麻煩的是下載不會自行終止。

所以最好做一個下載的批次執行檔,在windows 的工作排程中預約執行。
再做一個一個強制終止的批次執行檔,在直播結束時預約執行。

備註:存檔名抓系統年月日,在C碟的Mynideo目錄中自動存檔。

下載批次執行檔內容如下(存成auto_record.bat):
@echo off

set x=%date:~0,4%%date:~5,2%%date:~8,2%


CD /d C:\Myvideo
youtube-dl -o 公視台語台-%x%.%%(ext)s https://youtu.be/sHVv4H_mva0


強制終止的批次執行檔內容如下(存成stop_record.bat):
@echo off

taskkill /FI "IMAGENAME eq cmd.exe"


-------------------------------------------------------------------------------------

補充說明:
在jupyter notebook中執行的套件是 youtube_dl。

安裝指令:
pip install youtube_dl


安裝完成之後要重啟kernel
鍵入help (youtube_dl),按執行可以查詢指令

Help on package youtube_dl:

NAME
    youtube_dl - # coding: utf-8

PACKAGE CONTENTS
    YoutubeDL
    __main__
    aes
    cache
    compat
    downloader (package)
    extractor (package)
    jsinterp
    options
    postprocessor (package)
    socks
    swfinterp
    update
    utils
    version

CLASSES
    builtins.object
        youtube_dl.YoutubeDL.YoutubeDL
    
    class YoutubeDL(builtins.object)
     |  YoutubeDL(params=None, auto_init=True)
     |  
     |  YoutubeDL class.
     |  
     |  YoutubeDL objects are the ones responsible of downloading the
     |  actual video file and writing it to disk if the user has requested
     |  it, among some other tasks. In most cases there should be one per
     |  program. As, given a video URL, the downloader doesn't know how to
     |  extract all the needed information, task that InfoExtractors do, it
     |  has to pass the URL to one of them.
     |  
     |  For this, YoutubeDL objects have a method that allows
     |  InfoExtractors to be registered in a given order. When it is passed
     |  a URL, the YoutubeDL object handles it to the first InfoExtractor it
     |  finds that reports being able to handle it. The InfoExtractor extracts
     |  all the information about the video or videos the URL refers to, and
     |  YoutubeDL process the extracted information, possibly using a File
     |  Downloader to download the video.
     |  
     |  YoutubeDL objects accept a lot of parameters. In order not to saturate
     |  the object constructor with arguments, it receives a dictionary of
     |  options instead. These options are available through the params
     |  attribute for the InfoExtractors to use. The YoutubeDL also
     |  registers itself as the downloader in charge for the InfoExtractors
     |  that are added to it, so this is a "mutual registration".
     |  
     |  Available options:
     |  
     |  username:          Username for authentication purposes.
     |  password:          Password for authentication purposes.
     |  videopassword:     Password for accessing a video.
     |  ap_mso:            Adobe Pass multiple-system operator identifier.
     |  ap_username:       Multiple-system operator account username.
     |  ap_password:       Multiple-system operator account password.
     |  usenetrc:          Use netrc for authentication instead.
     |  verbose:           Print additional info to stdout.
     |  quiet:             Do not print messages to stdout.
     |  no_warnings:       Do not print out anything for warnings.
     |  forceurl:          Force printing final URL.
     |  forcetitle:        Force printing title.
     |  forceid:           Force printing ID.
     |  forcethumbnail:    Force printing thumbnail URL.
     |  forcedescription:  Force printing description.
     |  forcefilename:     Force printing final filename.
     |  forceduration:     Force printing duration.
     |  forcejson:         Force printing info_dict as JSON.
     |  dump_single_json:  Force printing the info_dict of the whole playlist
     |                     (or video) as a single JSON line.
     |  simulate:          Do not download the video files.
     |  format:            Video format code. See options.py for more information.
     |  outtmpl:           Template for output names.
     |  restrictfilenames: Do not allow "&" and spaces in file names
     |  ignoreerrors:      Do not stop on download errors.
     |  force_generic_extractor: Force downloader to use the generic extractor
     |  nooverwrites:      Prevent overwriting files.
     |  playliststart:     Playlist item to start at.
     |  playlistend:       Playlist item to end at.
     |  playlist_items:    Specific indices of playlist to download.
     |  playlistreverse:   Download playlist items in reverse order.
     |  playlistrandom:    Download playlist items in random order.
     |  matchtitle:        Download only matching titles.
     |  rejecttitle:       Reject downloads for matching titles.
     |  logger:            Log messages to a logging.Logger instance.
     |  logtostderr:       Log messages to stderr instead of stdout.
     |  writedescription:  Write the video description to a .description file
     |  writeinfojson:     Write the video description to a .info.json file
     |  writeannotations:  Write the video annotations to a .annotations.xml file
     |  writethumbnail:    Write the thumbnail image to a file
     |  write_all_thumbnails:  Write all thumbnail formats to files
     |  writesubtitles:    Write the video subtitles to a file
     |  writeautomaticsub: Write the automatically generated subtitles to a file
     |  allsubtitles:      Downloads all the subtitles of the video
     |                     (requires writesubtitles or writeautomaticsub)
     |  listsubtitles:     Lists all available subtitles for the video
     |  subtitlesformat:   The format code for subtitles
     |  subtitleslangs:    List of languages of the subtitles to download
     |  keepvideo:         Keep the video file after post-processing
     |  daterange:         A DateRange object, download only if the upload_date is in the range.
     |  skip_download:     Skip the actual download of the video file
     |  cachedir:          Location of the cache files in the filesystem.
     |                     False to disable filesystem cache.
     |  noplaylist:        Download single video instead of a playlist if in doubt.
     |  age_limit:         An integer representing the user's age in years.
     |                     Unsuitable videos for the given age are skipped.
     |  min_views:         An integer representing the minimum view count the video
     |                     must have in order to not be skipped.
     |                     Videos without view count information are always
     |                     downloaded. None for no limit.
     |  max_views:         An integer representing the maximum view count.
     |                     Videos that are more popular than that are not
     |                     downloaded.
     |                     Videos without view count information are always
     |                     downloaded. None for no limit.
     |  download_archive:  File name of a file where all downloads are recorded.
     |                     Videos already present in the file are not downloaded
     |                     again.
     |  cookiefile:        File name where cookies should be read from and dumped to.
     |  nocheckcertificate:Do not verify SSL certificates
     |  prefer_insecure:   Use HTTP instead of HTTPS to retrieve information.
     |                     At the moment, this is only supported by YouTube.
     |  proxy:             URL of the proxy server to use
     |  geo_verification_proxy:  URL of the proxy to use for IP address verification
     |                     on geo-restricted sites.
     |  socket_timeout:    Time to wait for unresponsive hosts, in seconds
     |  bidi_workaround:   Work around buggy terminals without bidirectional text
     |                     support, using fridibi
     |  debug_printtraffic:Print out sent and received HTTP traffic
     |  include_ads:       Download ads as well
     |  default_search:    Prepend this string if an input url is not valid.
     |                     'auto' for elaborate guessing
     |  encoding:          Use this encoding instead of the system-specified.
     |  extract_flat:      Do not resolve URLs, return the immediate result.
     |                     Pass in 'in_playlist' to only show this behavior for
     |                     playlist items.
     |  postprocessors:    A list of dictionaries, each with an entry
     |                     * key:  The name of the postprocessor. See
     |                             youtube_dl/postprocessor/__init__.py for a list.
     |                     as well as any further keyword arguments for the
     |                     postprocessor.
     |  progress_hooks:    A list of functions that get called on download
     |                     progress, with a dictionary with the entries
     |                     * status: One of "downloading", "error", or "finished".
     |                               Check this first and ignore unknown values.
     |  
     |                     If status is one of "downloading", or "finished", the
     |                     following properties may also be present:
     |                     * filename: The final filename (always present)
     |                     * tmpfilename: The filename we're currently writing to
     |                     * downloaded_bytes: Bytes on disk
     |                     * total_bytes: Size of the whole file, None if unknown
     |                     * total_bytes_estimate: Guess of the eventual file size,
     |                                             None if unavailable.
     |                     * elapsed: The number of seconds since download started.
     |                     * eta: The estimated time in seconds, None if unknown
     |                     * speed: The download speed in bytes/second, None if
     |                              unknown
     |                     * fragment_index: The counter of the currently
     |                                       downloaded video fragment.
     |                     * fragment_count: The number of fragments (= individual
     |                                       files that will be merged)
     |  
     |                     Progress hooks are guaranteed to be called at least once
     |                     (with status "finished") if the download is successful.
     |  merge_output_format: Extension to use when merging formats.
     |  fixup:             Automatically correct known faults of the file.
     |                     One of:
     |                     - "never": do nothing
     |                     - "warn": only emit a warning
     |                     - "detect_or_warn": check whether we can do anything
     |                                         about it, warn otherwise (default)
     |  source_address:    Client-side IP address to bind to.
     |  call_home:         Boolean, true iff we are allowed to contact the
     |                     youtube-dl servers for debugging.
     |  sleep_interval:    Number of seconds to sleep before each download when
     |                     used alone or a lower bound of a range for randomized
     |                     sleep before each download (minimum possible number
     |                     of seconds to sleep) when used along with
     |                     max_sleep_interval.
     |  max_sleep_interval:Upper bound of a range for randomized sleep before each
     |                     download (maximum possible number of seconds to sleep).
     |                     Must only be used along with sleep_interval.
     |                     Actual sleep time will be a random float from range
     |                     [sleep_interval; max_sleep_interval].
     |  listformats:       Print an overview of available video formats and exit.
     |  list_thumbnails:   Print a table of all thumbnails and exit.
     |  match_filter:      A function that gets called with the info_dict of
     |                     every video.
     |                     If it returns a message, the video is ignored.
     |                     If it returns None, the video is downloaded.
     |                     match_filter_func in utils.py is one example for this.
     |  no_color:          Do not emit color codes in output.
     |  geo_bypass:        Bypass geographic restriction via faking X-Forwarded-For
     |                     HTTP header
     |  geo_bypass_country:
     |                     Two-letter ISO 3166-2 country code that will be used for
     |                     explicit geographic restriction bypassing via faking
     |                     X-Forwarded-For HTTP header
     |  geo_bypass_ip_block:
     |                     IP range in CIDR notation that will be used similarly to
     |                     geo_bypass_country
     |  
     |  The following options determine which downloader is picked:
     |  external_downloader: Executable of the external downloader to call.
     |                     None or unset for standard (built-in) downloader.
     |  hls_prefer_native: Use the native HLS downloader instead of ffmpeg/avconv
     |                     if True, otherwise use ffmpeg/avconv if False, otherwise
     |                     use downloader suggested by extractor if None.
     |  
     |  The following parameters are not used by YoutubeDL itself, they are used by
     |  the downloader (see youtube_dl/downloader/common.py):
     |  nopart, updatetime, buffersize, ratelimit, min_filesize, max_filesize, test,
     |  noresizebuffer, retries, continuedl, noprogress, consoletitle,
     |  xattr_set_filesize, external_downloader_args, hls_use_mpegts,
     |  http_chunk_size.
     |  
     |  The following options are used by the post processors:
     |  prefer_ffmpeg:     If False, use avconv instead of ffmpeg if both are available,
     |                     otherwise prefer ffmpeg.
     |  ffmpeg_location:   Location of the ffmpeg/avconv binary; either the path
     |                     to the binary or its containing directory.
     |  postprocessor_args: A list of additional command-line arguments for the
     |                      postprocessor.
     |  
     |  The following options are used by the Youtube extractor:
     |  youtube_include_dash_manifest: If True (default), DASH manifests and related
     |                      data will be downloaded and processed by extractor.
     |                      You can reduce network I/O by disabling it if you don't
     |                      care about DASH.
     |  
     |  Methods defined here:
     |  
     |  __enter__(self)
     |  
     |  __exit__(self, *args)
     |  
     |  __init__(self, params=None, auto_init=True)
     |      Create a FileDownloader object with the given options.
     |  
     |  add_default_extra_info(self, ie_result, ie, url)
     |  
     |  add_default_info_extractors(self)
     |      Add the InfoExtractors returned by gen_extractors to the end of the list
     |  
     |  add_info_extractor(self, ie)
     |      Add an InfoExtractor object to the end of the list.
     |  
     |  add_post_processor(self, pp)
     |      Add a PostProcessor object to the end of the chain.
     |  
     |  add_progress_hook(self, ph)
     |      Add the progress hook (currently only for the file downloader)
     |  
     |  build_format_selector(self, format_spec)
     |  
     |  download(self, url_list)
     |      Download a given list of URLs.
     |  
     |  download_with_info_file(self, info_filename)
     |  
     |  encode(self, s)
     |  
     |  extract_info(self, url, download=True, ie_key=None, extra_info={}, process=True, force_generic_extractor=False)
     |      Returns a list with a dictionary for each video we find.
     |      If 'download', also downloads the videos.
     |      extra_info is a dict containing the extra values to add to each result
     |  
     |  get_encoding(self)
     |  
     |  get_info_extractor(self, ie_key)
     |      Get an instance of an IE with name ie_key, it will try to get one from
     |      the _ies list, if there's no instance it will create a new one and add
     |      it to the extractor list.
     |  
     |  in_download_archive(self, info_dict)
     |  
     |  list_formats(self, info_dict)
     |  
     |  list_subtitles(self, video_id, subtitles, name='subtitles')
     |  
     |  list_thumbnails(self, info_dict)
     |  
     |  post_process(self, filename, ie_info)
     |      Run all the postprocessors on the given file.
     |  
     |  prepare_filename(self, info_dict)
     |      Generate the output filename.
     |  
     |  print_debug_header(self)
     |  
     |  process_ie_result(self, ie_result, download=True, extra_info={})
     |      Take the result of the ie(may be modified) and resolve all unresolved
     |      references (URLs, playlist items).
     |      
     |      It will also download the videos if 'download'.
     |      Returns the resolved ie_result.
     |  
     |  process_info(self, info_dict)
     |      Process a single resolved IE result.
     |  
     |  process_subtitles(self, video_id, normal_subtitles, automatic_captions)
     |      Select the requested subtitles and their format
     |  
     |  process_video_result(self, info_dict, download=True)
     |  
     |  record_download_archive(self, info_dict)
     |  
     |  report_error(self, message, tb=None)
     |      Do the same as trouble, but prefixes the message with 'ERROR:', colored
     |      in red if stderr is a tty file.
     |  
     |  report_file_already_downloaded(self, file_name)
     |      Report file has already been fully downloaded.
     |  
     |  report_warning(self, message)
     |      Print the message to stderr, it will be prefixed with 'WARNING:'
     |      If stderr is a tty file the 'WARNING:' will be colored
     |  
     |  restore_console_title(self)
     |  
     |  save_console_title(self)
     |  
     |  to_console_title(self, message)
     |  
     |  to_screen(self, message, skip_eol=False)
     |      Print message to stdout if not in quiet mode.
     |  
     |  to_stderr(self, message)
     |      Print message to stderr.
     |  
     |  to_stdout(self, message, skip_eol=False, check_quiet=False)
     |      Print message to stdout if not in quiet mode.
     |  
     |  trouble(self, message=None, tb=None)
     |      Determine action to take when a download problem appears.
     |      
     |      Depending on if the downloader has been configured to ignore
     |      download errors or not, this method may throw an exception or
     |      not when errors are found, after printing the message.
     |      
     |      tb, if given, is additional traceback information.
     |  
     |  urlopen(self, req)
     |      Start an HTTP download
     |  
     |  warn_if_short_id(self, argv)
     |  
     |  ----------------------------------------------------------------------
     |  Static methods defined here:
     |  
     |  add_extra_info(info_dict, extra_info)
     |      Set the keys from extra_info in info dict if they are missing
     |  
     |  filter_requested_info(info_dict)
     |  
     |  format_resolution(format, default='unknown')
     |  
     |  ----------------------------------------------------------------------
     |  Data descriptors defined here:
     |  
     |  __dict__
     |      dictionary for instance variables (if defined)
     |  
     |  __weakref__
     |      list of weak references to the object (if defined)
     |  
     |  ----------------------------------------------------------------------
     |  Data and other attributes defined here:
     |  
     |  params = None

FUNCTIONS
    gen_extractors()
        Return a list of an instance of every supported extractor.
        The order does matter; the first extractor matched is the one handling the URL.
    
    list_extractors(age_limit)
        Return a list of extractors that are suitable for the given age,
        sorted by extractor ID.
    
    main(argv=None)

DATA
    __all__ = ['main', 'YoutubeDL', 'gen_extractors', 'list_extractors']
    __license__ = 'Public Domain'

FILE
    c:\users\superuser\miniconda3\lib\site-packages\youtube_dl\__init__.py

沒有留言:

【公告】網站遷移,未來內容將發表於新網站!!

 受限於blogger本身架構與限制,本網站所有內容已遷移至新網站,網址如下: https://kuo.us.to/wordpress/