Extract Files From a .tar.gz or .tar.bz2 File on Linux

Tar files are archives, and they are usually compressed. You’ll encounter them frequently while using a Linux distribution like Ubuntu, or even while using the terminal on macOS. Here’s how to extract (or untar) the contents of a tar file, also known as a tarball.

What Do .tar.gz and .tar.bz2 Mean?

Files that have a .tar.gz or a .tar.bz2 extension are compressed archive files. A file with just a .tar extension is uncompressed, but those will be very rare.

The .tar portion of the file extension stands for tape archive, and is the reason that both of these file types are called tar files. Tar files date all the way back to 1979, when the tar command was created to allow system administrators to archive files onto tape. Forty years later we are still using the tar command, now to extract tar files onto our hard drives. Someone somewhere is probably still using tar with tape.

The .gz or .bz2 suffix indicates that the archive has been compressed, using either the gzip or the bzip2 compression algorithm. The tar command works with both types of file, so it doesn’t matter which compression method was used, and tar should be available everywhere you have a Bash shell. You just need to use the appropriate tar command line options.

Extracting Files from Tar Files

Let’s say you’ve downloaded two files of sheet music. One file is called ukulele_songs.tar.gz, the other is called guitar_songs.tar.bz2. These files are in the Downloads directory.

Two tar files in the downloads directory

Let’s extract the ukulele songs:

tar -xvzf ukulele_songs.tar.gz

As the files are extracted, they are listed in the terminal window.

Extraction of all files from tar file

The command line options we used are:

  • -x: Extract, retrieve the files from the tar file.
  • -v: Verbose, list the files as they are being extracted.
  • -z: Gzip, use gzip to decompress the tar file.
  • -f: File, the name of the tar file we want tar to work with. This option must be followed by the name of the tar file.

List the files in the directory with ls and you’ll see that a directory has been created called Ukulele Songs. The extracted files are in that directory. Where did this directory come from? It was contained in the tar file, and was extracted along with the files.

Ukulele Songs directory created in Downloads directory
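If you don’t have a tar file to hand, the following is a minimal sketch you can use to try the command safely. It builds a small stand-in archive in a temporary directory and then extracts it. The file names inside the archive and the use of mktemp are assumptions made for illustration; they are not the contents of the real ukulele_songs.tar.gz.

```shell
# Work in a scratch directory so nothing in your real Downloads directory is touched.
work=$(mktemp -d)
cd "$work"

# Build a tiny stand-in archive: one top-level directory holding two files.
# These file names are hypothetical.
mkdir -p "Ukulele Songs/Ramones"
echo "song one" > "Ukulele Songs/001 - Example.txt"
echo "song two" > "Ukulele Songs/Ramones/002 - Example.txt"
tar -czf ukulele_songs.tar.gz "Ukulele Songs"
rm -r "Ukulele Songs"

# Extract it again: -x extract, -v verbose, -z gzip, -f archive file name.
tar -xvzf ukulele_songs.tar.gz

# The top-level directory stored in the archive has been recreated.
ls "Ukulele Songs"
```

Because the archive stores the Ukulele Songs directory itself, extracting it recreates that directory in the current location.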

Now let’s extract the guitar songs. To do this we’ll use almost exactly the same command as before, with one important difference. The .bz2 suffix tells us the archive has been compressed using bzip2. Instead of using the -z (gzip) option, we will use the -j (bzip2) option.

tar -xvjf guitar_songs.tar.bz2

Extraction of guitar songs tar file in Downloads folder

Once again, the files are listed in the terminal as they are extracted. To be clear, the command line options we used with tar for the .tar.bz2 file were:

  • -x: Extract, retrieve the files from the tar file.
  • -v: Verbose, list the files as they are being extracted.
  • -j: Bzip2, use bzip2 to decompress the tar file.
  • -f: File, the name of the tar file we want tar to work with. This option must be followed by the name of the tar file.

If we list the files in the Downloads directory we will see that another directory called Guitar Songs has been created.

Guitar songs directory created in Downloads directory
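The bzip2 case can be tried the same way. This is a sketch under a few assumptions: the archive contents are made up, the work is done in a temporary directory, and the whole example is skipped if the bzip2 program is not installed, since tar relies on it for the -j option.

```shell
# tar needs the bzip2 program for -j, so only run the demo if it is present.
if command -v bzip2 >/dev/null 2>&1; then
    work=$(mktemp -d)
    cd "$work"

    # Hypothetical stand-in for the guitar songs archive.
    mkdir "Guitar Songs"
    echo "chords" > "Guitar Songs/001 - Example.txt"
    tar -cjf guitar_songs.tar.bz2 "Guitar Songs"
    rm -r "Guitar Songs"

    # Same command as for gzip, with -j in place of -z.
    tar -xvjf guitar_songs.tar.bz2
    ls "Guitar Songs"
else
    echo "bzip2 is not installed; skipping the example"
fi
```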

Choosing Where to Extract the Files To

If we want to extract the files to a location other than the current directory, we can specify a target directory using the -C (specified directory) option.

tar -xvjf guitar_songs.tar.bz2 -C ~/Documents/Songs/

Looking in our Documents/Songs directory we’ll see the Guitar Songs directory has been created.

Guitar songs directory created in Documents/Songs directory

Note that the target directory must already exist; tar will not create it if it is not present. If you need to create a directory and have tar extract the files into it on a single command line, you can do that as follows:

mkdir -p ~/Documents/Songs/Downloaded && tar -xvjf guitar_songs.tar.bz2 -C ~/Documents/Songs/Downloaded/

The -p (parents) option causes mkdir to create any parent directories that are required, ensuring the target directory is created.
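The behaviour of -C can be checked with a short experiment. The sketch below is illustrative only: it uses a made-up gzip archive in a temporary directory (gzip rather than bzip2, simply so the example does not depend on bzip2 being installed), and the target path is a hypothetical stand-in for ~/Documents/Songs/Downloaded.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in archive.
mkdir "Guitar Songs"
echo "chords" > "Guitar Songs/001 - Example.txt"
tar -czf guitar_songs.tar.gz "Guitar Songs"
rm -r "Guitar Songs"

# Stand-in for ~/Documents/Songs/Downloaded; none of these directories exist yet.
target="$work/Documents/Songs/Downloaded"

# tar does not create the -C directory, so this first attempt fails.
if ! tar -xzf guitar_songs.tar.gz -C "$target" 2>/dev/null; then
    echo "tar failed because $target does not exist"
fi

# Create the directory (and its parents) first; && runs tar only if mkdir succeeds.
mkdir -p "$target" && tar -xvzf guitar_songs.tar.gz -C "$target"

ls "$target"
```

Joining the two commands with && means tar is never run against a directory that mkdir failed to create.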

Looking Inside Tar Files Before Extracting Them

So far we’ve just taken a leap of faith and extracted the files sight unseen. You might like to look before you leap. You can review the contents of a tar file before you extract it by using the -t (list) option. It is usually convenient to pipe the output through the less command.

tar -tf ukulele_songs.tar.gz | less

Notice that we don’t need to use the -z option to list the files. Modern versions of tar detect the compression type automatically when they read an archive file, so the -z and -j options are optional for listing and for extracting alike. Including them is harmless and makes the command self-documenting.

Contents of tar file piped through less

Scrolling through the output we can see that everything in the tar file is held within a directory called Ukulele Songs, and within that directory, there are files and other directories.

Second view of contents of tar file piped through less

We can see that the Ukulele Songs directory contains directories called Random Songs, Ramones and Possibles.
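Listing is also useful in scripts, where you might check for a path before extracting anything. The following sketch assumes a small made-up archive in a temporary directory; the file names are hypothetical, and head is used instead of less so the example runs without user interaction.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in archive with two sub-directories.
mkdir -p "Ukulele Songs/Ramones" "Ukulele Songs/Possibles"
echo "a" > "Ukulele Songs/Ramones/001 - Example.txt"
echo "b" > "Ukulele Songs/Possibles/002 - Example.txt"
tar -czf ukulele_songs.tar.gz "Ukulele Songs"

# -t lists the member names without extracting anything.
listing=$(tar -tf ukulele_songs.tar.gz)
printf '%s\n' "$listing" | head

# Adding -v shows permissions, owner, size and timestamp for each member.
tar -tvf ukulele_songs.tar.gz

# Check that a directory is in the archive before trying to extract it.
if printf '%s\n' "$listing" | grep -q "^Ukulele Songs/Ramones/"; then
    echo "The archive contains a Ramones directory"
fi
```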

To extract all the files from a directory within a tar file use the following command. Note that the path is wrapped in quotation marks because there are spaces in the path.

tar -xvzf ukulele_songs.tar.gz "Ukulele Songs/Ramones/"

Extracting single folder from tar file

To extract a single file, provide the path and the name of the file.

tar -xvzf ukulele_songs.tar.gz "Ukulele Songs/023 - My Babe.odt"

Extracting single file from tar file

You can extract a selection of files by using wildcards, where * represents any string of characters and ? represents any single character. Using wildcards requires the use of the --wildcards option.

tar -xvz --wildcards -f ukulele_songs.tar.gz "Ukulele Songs/Possibles/B*"

Extracting songs from tar with wildcards
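To see exactly which members a pattern selects, you can experiment on a throwaway archive. This sketch assumes GNU tar (the --wildcards option is a GNU tar feature) and uses made-up file names in a temporary directory.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in archive.
mkdir -p "Ukulele Songs/Possibles" "Ukulele Songs/Ramones"
echo "b" > "Ukulele Songs/Possibles/Baby Example.txt"
echo "c" > "Ukulele Songs/Possibles/Carol Example.txt"
echo "r" > "Ukulele Songs/Ramones/Rock Example.txt"
tar -czf ukulele_songs.tar.gz "Ukulele Songs"
rm -r "Ukulele Songs"

# Quote the pattern so the shell passes it to tar unexpanded.
# Only members in Possibles whose names start with B are extracted.
tar -xvz --wildcards -f ukulele_songs.tar.gz "Ukulele Songs/Possibles/B*"

# Show what was actually written to disk.
find "Ukulele Songs" -type f
```

Only the matching file is extracted; tar creates the parent directories it needs along the way.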

Extracting Files Without Extracting Directories

If you don’t want the directory structure in the tar file to be recreated on your hard drive, use the --strip-components option. The --strip-components option requires a numerical parameter. The number represents how many levels of directories to ignore. Files from the ignored directories are still extracted, but the directory structure is not replicated on your hard drive.

If we specify --strip-components=1 with our example tar file, the top-most Ukulele Songs directory within the tar file is not created on the hard drive. The files and directories that it contains are extracted directly into the target directory instead.

tar -xvzf ukulele_songs.tar.gz --strip-components=1

Extracting files from tar file with --strip-components=1

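There are only two levels of directory nesting within our example tar file. So if we use --strip-components=2, the files inside the second-level directories are extracted directly into the target directory, and no other directories are created. Be aware that tar skips any member whose path has no more components than the number being stripped, so files that sit directly inside Ukulele Songs are not extracted by this command.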

tar -xvzf ukulele_songs.tar.gz --strip-components=2

Extracting files from tar file with --strip-components=2
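The effect of the two settings is easiest to see side by side. The sketch below is an illustration under assumptions: it uses a made-up archive with one file at the top level and one file a level deeper, and it extracts into two hypothetical directories named one and two inside a temporary directory.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in archive: one file directly in the top directory,
# one file a level further down.
mkdir -p "Ukulele Songs/Ramones"
echo "top" > "Ukulele Songs/top.txt"
echo "nested" > "Ukulele Songs/Ramones/nested.txt"
tar -czf ukulele_songs.tar.gz "Ukulele Songs"
rm -r "Ukulele Songs"

mkdir one two

# Drop the leading "Ukulele Songs" component only.
tar -xzf ukulele_songs.tar.gz --strip-components=1 -C one

# Drop two leading components. Members with too few components
# (such as "Ukulele Songs/top.txt") are skipped.
tar -xzf ukulele_songs.tar.gz --strip-components=2 -C two

echo "--strip-components=1:"
find one -type f
echo "--strip-components=2:"
find two -type f
```

With one component stripped, the Ramones directory is kept below the target directory. With two stripped, only nested.txt is written, directly into the target directory.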

If you look at the Linux man page you’ll see that tar has got to be a good candidate for the title of “command having the most command line options.” Thankfully, to extract files from .tar.gz and .tar.bz2 files with a good degree of granular control, we only need to remember a handful of these options.


Source: How-To Geek
