June 24, 2022

Decompress And Untar Multiple Gzipped files in Java

In the post GZIP Multiple Files in Java Creating Tar Archive we saw that to gzip multiple files you first create a tar archive and then compress it using GZIP. In this post we’ll see how to do the reverse in Java: first decompress the gzipped file (.tar.gz) and then untar the resulting tarball.

Decompress gzipped files and untar archive in Java

The Java program given here to unarchive multiple files compressed as .tar.gz uses the Apache Commons Compress library, which can be downloaded from https://commons.apache.org/proper/commons-compress/download_compress.cgi

The version used here is commons-compress-1.18, so commons-compress-1.18.jar is added to the classpath.

The following two classes from the Apache Commons Compress library are used to untar the tar archive. Decompressing the gzip file itself is done with java.util.zip.GZIPInputStream from the JDK.

TarArchiveEntry- Represents an entry in a Tar archive.

TarArchiveInputStream- This class has the method getNextTarEntry() to read the next entry (a TarArchiveEntry) from the archive. While looping over these entries you check whether each one is a directory or a file. If it is a directory, just create that directory; if it is a file, read the content of the file and write it to an output stream.
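
To see the gzip step on its own, here is a minimal sketch that uses only JDK classes. The class name GunzipSketch and the helper gunzip() are made up for illustration, and the payload is built in memory so that no files are needed. With a real .tar.gz file the decompressed bytes would be the tar archive, which is what TarArchiveInputStream then reads.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GunzipSketch {
  // Hypothetical helper: copies the decompressed bytes of a gzip stream to out
  static void gunzip(InputStream compressed, OutputStream out) throws IOException {
    try (GZIPInputStream gis = new GZIPInputStream(compressed)) {
      byte[] buffer = new byte[8192];
      int len;
      while ((len = gis.read(buffer)) != -1) {
        out.write(buffer, 0, len);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Build a small gzip payload in memory (stands in for a .gz file)
    ByteArrayOutputStream gz = new ByteArrayOutputStream();
    try (GZIPOutputStream gos = new GZIPOutputStream(gz)) {
      gos.write("hello tar".getBytes(StandardCharsets.UTF_8));
    }
    // Decompress it again and print the original content
    ByteArrayOutputStream plain = new ByteArrayOutputStream();
    gunzip(new ByteArrayInputStream(gz.toByteArray()), plain);
    System.out.println(new String(plain.toByteArray(), StandardCharsets.UTF_8));
  }
}
```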

Java program – Decompress Gzip file and untar

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.utils.IOUtils;

public class UntarArchive {
  public static void main(String[] args) {
    String COMPRESSED_FILE = "/home/knpcode/Documents/test.tar.gz";
    String DESTINATION_PATH = "/home/knpcode/Documents/";
    File destFile = new File(DESTINATION_PATH);
    unTarFile(COMPRESSED_FILE, destFile);
  }
  private static void unTarFile(String tarFile, File destFile) {
    TarArchiveInputStream tis = null;
    try {
      FileInputStream fis = new FileInputStream(tarFile);
      // .gz
      GZIPInputStream gzipInputStream = new GZIPInputStream(new BufferedInputStream(fis));
      //.tar.gz
      tis = new TarArchiveInputStream(gzipInputStream);
      TarArchiveEntry tarEntry = null;
      while ((tarEntry = tis.getNextTarEntry()) != null) {
        System.out.println(" tar entry- " + tarEntry.getName());
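        File outputFile = new File(destFile, tarEntry.getName());
        if(tarEntry.isDirectory()){
          // Create the directory itself so that empty directories are preserved
          outputFile.mkdirs();
        }else {
          // In case entry is for file ensure parent directory is in place
          // and write file content to Output Stream
          outputFile.getParentFile().mkdirs();
          // try-with-resources closes the output stream after each file
          try (FileOutputStream fos = new FileOutputStream(outputFile)) {
            IOUtils.copy(tis, fos);
          }
        }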
      }      		
    }catch(IOException ex) {
      System.out.println("Error while untarring a file- " + ex.getMessage());
    }finally {
      if(tis != null) {
        try {
          tis.close();
        } catch (IOException e) {
          e.printStackTrace();
        }
      }
    }	
  }
}
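
One thing the program above does not do is check where an entry name points. An archive from an untrusted source can contain names such as ../../somefile, which would be written outside the destination directory. The following is a minimal sketch of a guard, assuming a plain lexical check is enough for your case (it does not resolve symbolic links). The class SafeEntryPath and the method resolveEntry() are hypothetical names and not part of Commons Compress.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

class SafeEntryPath {
  // Hypothetical helper: resolves an entry name against the destination
  // directory and rejects names that would end up outside of it
  static Path resolveEntry(Path destDir, String entryName) throws IOException {
    Path base = destDir.toAbsolutePath().normalize();
    Path target = base.resolve(entryName).normalize();
    // Path.startsWith compares whole path elements, not raw string prefixes
    if (!target.startsWith(base)) {
      throw new IOException("Entry is outside of the target directory: " + entryName);
    }
    return target;
  }

  public static void main(String[] args) {
    Path dest = Paths.get(System.getProperty("java.io.tmpdir"), "untar-demo");
    for (String name : new String[] {"docs/readme.txt", "../../evil.txt"}) {
      try {
        System.out.println("OK: " + resolveEntry(dest, name));
      } catch (IOException ex) {
        System.out.println("Rejected: " + ex.getMessage());
      }
    }
  }
}
```

In the untar loop the returned path could be used in place of the File built from tarEntry.getName(), for example by calling toFile() on it.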

If you just want to untar a file, that is, you have to unarchive a .tar file and not a .tar.gz file, then you can skip creating the GZIPInputStream.

Instead of this

FileInputStream fis = new FileInputStream(tarFile);
GZIPInputStream gzipInputStream = new GZIPInputStream(new BufferedInputStream(fis));
TarArchiveInputStream tis = new TarArchiveInputStream(gzipInputStream);

Use this

FileInputStream fis = new FileInputStream(tarFile);
TarArchiveInputStream tis = new TarArchiveInputStream(new BufferedInputStream(fis));
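
If the same code has to handle both .tar and .tar.gz input, one option is to look at the first two bytes of the stream, since a gzip stream starts with the bytes 0x1f 0x8b. Below is a minimal sketch of that idea using only JDK classes. The class GzipDetect and the method maybeGunzip() are hypothetical names. The returned stream would then be passed to the TarArchiveInputStream constructor.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

class GzipDetect {
  // Hypothetical helper: wraps the input in a GZIPInputStream only when
  // it starts with the gzip magic number, otherwise returns it buffered
  static InputStream maybeGunzip(InputStream in) throws IOException {
    BufferedInputStream bis = new BufferedInputStream(in);
    bis.mark(2);
    int b1 = bis.read();
    int b2 = bis.read();
    // Rewind so that the caller sees the stream from its first byte
    bis.reset();
    // GZIP_MAGIC is 0x8b1f, which is 0x1f followed by 0x8b in stream order
    boolean gzipped = b1 != -1 && b2 != -1
        && ((b2 << 8) | b1) == GZIPInputStream.GZIP_MAGIC;
    return gzipped ? new GZIPInputStream(bis) : bis;
  }

  public static void main(String[] args) throws IOException {
    // Plain bytes are passed through unchanged
    byte[] plain = "not gzipped".getBytes(StandardCharsets.UTF_8);
    try (InputStream in = maybeGunzip(new ByteArrayInputStream(plain))) {
      System.out.println("gzip detected: " + (in instanceof GZIPInputStream));
    }
  }
}
```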

That's all for the topic Decompress And Untar Multiple Gzipped files in Java. If something is missing or you have something to share about the topic please write a comment.

