Getting Started with Terminal: Install and Use wget

wget is a non-interactive command-line utility for downloading resources from a specified URL. Because it is non-interactive, wget can work in the background or before the user even logs in. The program was designed with slow and unstable connections in mind, so it can retry and resume downloads in otherwise flaky conditions. While wget isn’t shipped with macOS, it can be easily downloaded and installed with Homebrew, a popular package manager for macOS.

1. Download and Install Homebrew

To install Homebrew, open a Terminal window and execute the following command taken from Homebrew’s website:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

You might notice the command called curl, which is a different command-line utility for downloading files from a URL. Unlike wget, curl ships with macOS itself.

2. Installing wget

Once Homebrew has finished installing, we will use it to install wget. In Terminal, run the following command to download and install wget:

brew install wget

You’ll get live updates on the progress of downloading and installing whatever dependencies (software prerequisites) are required to run wget on your system.

If you already have Homebrew installed, be sure to run brew update to get the latest copies of all your formulae.
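Before moving on, it can help to confirm that both tools are actually on your PATH. The snippet below is a minimal sketch; `have_cmd` is a hypothetical helper name chosen for this example, not something Homebrew or wget provides.

```shell
# Hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report where each tool lives, or that it is missing.
for tool in brew wget; do
  if have_cmd "$tool"; then
    echo "$tool: $(command -v "$tool")"
  else
    echo "$tool: not found"
  fi
done
```

If wget is reported as not found right after installation, opening a new Terminal window so the shell rereads its PATH is a reasonable first thing to try.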

3. Using wget

wget downloads content from URLs. It’s a quick, non-interactive tool for fetching files from any publicly accessible URL.

Download a single file

Like the similar command curl, wget takes a remote resource from a URL and saves it to a specified location on your computer. The command’s structure works like so:

wget -O path/to/local.copy https://example.com/url/to/download.html

That will save the file specified in the URL to the location specified on your machine. If the -O flag is omitted, the file is saved in the present working directory under its remote file name.
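One practical wrinkle: -O writes to exactly the path you give it, and in my experience it fails if the parent directory does not exist yet. The sketch below wraps the command above in a hypothetical `fetch_to` function (the name and argument order are assumptions made for this example) that creates the directory first.

```shell
# Hypothetical helper: download URL to DEST, creating DEST's directory first.
# Usage: fetch_to URL DEST
fetch_to() {
  url=$1 dest=$2
  # Make sure the destination directory exists before wget opens the file.
  mkdir -p "$(dirname "$dest")" || return 1
  wget -O "$dest" "$url"
}
```

For example, calling fetch_to with https://example.com/url/to/download.html and path/to/local.copy would create path/to/ if needed and then run the same wget -O command shown earlier.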

Download a directory recursively

To download an entire directory tree with wget, you need to use the -r/--recursive and -np/--no-parent flags, like so:

wget -e robots=off -r -np https://www.w3.org/History/19921103-hypertext/hypertext/

This will cause wget to follow the links found in documents within the specified directory, recursively downloading everything beneath the specified URL path. The -np flag keeps wget from climbing into parent directories along the way.

That command also includes -e robots=off, which tells wget to ignore restrictions in the site’s robots.txt file. By default wget honors robots.txt during recursive downloads, so disabling it prevents abridged downloads. Only do this on sites you have permission to mirror.
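Recursive downloads can put real load on a server, so it is often worth capping the depth and pausing between requests. The sketch below is one way to do that; `mirror_dir` is a hypothetical wrapper name, while -l (recursion depth) and --wait (seconds between requests) are standard wget options. It leaves robots.txt handling at wget's default.

```shell
# Hypothetical wrapper around a recursive download.
# Usage: mirror_dir URL [DEPTH]
# DEPTH caps how many levels of links wget follows (5 if not given).
mirror_dir() {
  url=$1 depth=${2:-5}
  # -r: recurse, -np: never ascend to the parent directory,
  # -l: maximum recursion depth, --wait=1: pause 1 second between requests.
  wget -r -np -l "$depth" --wait=1 "$url"
}
```

Calling mirror_dir with the w3.org URL from the example and a depth of 2 would fetch that directory and the pages it links to two levels down, without touching anything above it.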

Other wget Flags and Options

In addition to the flags above, the following handful of wget flags are among the most useful:

Controlling the download

-X /absolute/path/to/directory will exclude a specific directory on the remote server.
-nH removes the hostname directories. Remember, the hostname is the part of the URL that contains the domain name and ends in a TLD like “.com.” For example, the folder named “www.w3.org” in our previous example would be skipped, starting the download with the “History” directory instead.
--cut-dirs=# skips the specified number of directories down the URL before starting to download files. For example, -nH --cut-dirs=1 would change the specified path of “ftp.xemacs.org/pub/xemacs/” into simply “/xemacs/,” reducing the number of empty parent directories in the local download.
-R index.html/--reject index.html will skip any files matching the specified file name. In this case it will exclude all the index files. The * character can be used as a wildcard, like “*.png,” which would skip all files with the PNG extension.
-i file reads target URLs from an input file. By default the file is treated as a plain list of URLs, one per line; add the flag --force-html to have it parsed as HTML instead.
-nc/--no-clobber will not overwrite files that already exist in the destination.
-c/--continue will continue downloads of partially downloaded files.
-t 10 will try to download the resource up to 10 times before failing.
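The interaction between -nH and --cut-dirs is easier to see with a small model. The function below is an illustration only, not wget's own code: `local_dir` is a hypothetical name, and it simply mimics how the two flags trim the local directory for a remote directory such as ftp.xemacs.org/pub/xemacs/.

```shell
# Model (not wget's own code) of how -nH and --cut-dirs=N shape the
# local directory that a remote directory is saved into.
# Usage: local_dir HOST REMOTE_DIR STRIP_HOST(0|1) CUT_DIRS
local_dir() {
  host=$1 dir=$2 strip_host=$3 cut=$4
  # Trim one leading and one trailing slash from the remote directory.
  dir=${dir#/}
  dir=${dir%/}
  # Drop the first CUT_DIRS path components, as --cut-dirs would.
  while [ "$cut" -gt 0 ] && [ -n "$dir" ]; do
    case $dir in
      */*) dir=${dir#*/} ;;
      *)   dir= ;;
    esac
    cut=$((cut - 1))
  done
  if [ "$strip_host" -eq 1 ]; then
    # -nH: no hostname directory; "." means the current directory.
    printf '%s\n' "${dir:-.}"
  else
    printf '%s\n' "$host${dir:+/$dir}"
  fi
}

local_dir ftp.xemacs.org /pub/xemacs/ 0 0   # no flags
local_dir ftp.xemacs.org /pub/xemacs/ 1 0   # -nH
local_dir ftp.xemacs.org /pub/xemacs/ 1 1   # -nH --cut-dirs=1
local_dir ftp.xemacs.org /pub/xemacs/ 1 2   # -nH --cut-dirs=2
```

The four calls print ftp.xemacs.org/pub/xemacs, pub/xemacs, xemacs and a single dot, in that order, which lines up with the -nH --cut-dirs=1 example described above.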

Adjusting the level of logging

-d enables debugging output.
-o path/to/log.txt writes log output to the specified file instead of displaying it in the terminal (wget normally logs to standard error).
-q turns off all of wget’s output, including error messages.
-v explicitly enables wget’s default of verbose output.
-nv/--no-verbose turns off verbose output but still displays error messages and basic information.
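The logging flags are most useful in scripts, where you usually want a quiet run and the full log only when something goes wrong. The sketch below shows one way to combine -nv and -o; `logged_fetch` is a hypothetical helper name invented for this example.

```shell
# Hypothetical helper: download URL with reduced output (-nv), writing
# wget's log to LOG. The log is printed only if wget fails.
# Usage: logged_fetch URL LOG
logged_fetch() {
  url=$1 log=$2
  if wget -nv -o "$log" "$url"; then
    echo "ok: $url"
  else
    status=$?   # wget's exit status from the failed condition above
    echo "failed ($status): $url" >&2
    cat "$log" >&2
    return "$status"
  fi
}
```

Because the function returns wget's own exit status, a calling script can still distinguish between failure types if it needs to.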

Conclusion

While that should cover the majority of wget use cases, the downloader is capable of much more. For a full description of wget’s capabilities, you can review wget’s GNU man page online.


Kokou Adzo
