Java Condition Interface

The Condition interface, which resides in the java.util.concurrent.locks package, provides methods for inter-thread communication similar to the Object class monitor methods (wait, notify and notifyAll). Its counterparts are await(), signal() and signalAll(). Where a Lock replaces the use of synchronized methods and statements, a Condition replaces the use of the Object monitor methods.

Some of the methods defined in java.util.concurrent.locks.Condition interface are given below.

  • await()- Causes the current thread to wait until it is signalled or interrupted.
  • await(long time, TimeUnit unit)- Causes the current thread to wait until it is signalled or interrupted, or the specified waiting time elapses.
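  • awaitNanos(long nanosTimeout)- Causes the current thread to wait until it is signalled or interrupted, or the specified waiting time (in nanoseconds) elapses. Returns an estimate of the remaining nanoseconds, or a value less than or equal to zero if it timed out.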
  • awaitUninterruptibly()- Causes the current thread to wait until it is signalled.
  • awaitUntil(Date deadline)- Causes the current thread to wait until it is signalled or interrupted, or the specified deadline elapses.
  • signal()- Wakes up one waiting thread.
  • signalAll()- Wakes up all waiting threads.
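
The timed variants let a thread give up waiting after a while. Here is a minimal sketch of that idea, assuming a hypothetical TimedAwaitDemo class with an isReady flag that no other thread ever sets, so the wait always times out. The class, field and method names are illustrative only.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, not part of the original example
public class TimedAwaitDemo {
  private static final Lock lock = new ReentrantLock();
  private static final Condition ready = lock.newCondition();
  private static boolean isReady = false; // never set to true in this sketch

  // Waits up to timeoutMillis for isReady; returns false on timeout or interrupt
  static boolean waitForReady(long timeoutMillis) {
    long nanosLeft = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    lock.lock();
    try {
      while (!isReady) {
        if (nanosLeft <= 0L) {
          return false; // waiting time elapsed
        }
        // awaitNanos returns an estimate of the time still remaining,
        // so a spurious wakeup does not restart the full timeout
        nanosLeft = ready.awaitNanos(nanosLeft);
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // restore the interrupt status
      return false;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    System.out.println("Signalled in time- " + waitForReady(100));
  }
}
```

Since nothing signals the condition here, the program prints "Signalled in time- false" after roughly 100 milliseconds.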

How to get Condition instance

A Condition instance is intrinsically bound to a lock. To obtain a Condition instance for a particular Lock instance use its newCondition() method.
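
Because a Condition is bound to its Lock, the lock has to be held by the thread that calls await() or signal(). The sketch below assumes ReentrantLock, whose Condition implementation throws IllegalMonitorStateException when the lock is not held. The class and method names are hypothetical.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class showing that a Condition must be used with its lock held
public class ConditionOwnershipDemo {

  // Returns true if signalling without holding the lock is rejected
  static boolean signalWithoutLockFails() {
    Lock lock = new ReentrantLock();
    Condition condition = lock.newCondition();
    try {
      condition.signal(); // lock is NOT held here
      return false;
    } catch (IllegalMonitorStateException e) {
      return true;
    }
  }

  // Signalling while holding the lock is fine, even when no thread is waiting
  static boolean signalWithLockSucceeds() {
    Lock lock = new ReentrantLock();
    Condition condition = lock.newCondition();
    lock.lock();
    try {
      condition.signal();
      return true;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    System.out.println("Rejected without lock- " + signalWithoutLockFails());
    System.out.println("Accepted with lock- " + signalWithLockSucceeds());
  }
}
```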

Example using Condition interface methods

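The following producer-consumer program uses the methods of the Condition interface for communication between two threads.

In the example the Consumer thread waits in await() on the notEmpty condition while the buffer is empty, and the Producer thread waits on the notFull condition while the buffer is full. Each put() signals notEmpty and each take() signals notFull. The output shown after the program is one possible interleaving; the exact order depends on thread scheduling.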

package com.knpcode.proj.Programs;

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class ProduceConsume {
  public static void main(String[] args) {
    int capacity = 5;
    // shared object
    Buffer buffer = new Buffer(capacity);
    Thread t1 = new Thread(new Producer(buffer, capacity), "Producer");
    Thread t2 = new Thread(new Consumer(buffer, capacity), "Consumer");
    t1.start();
    t2.start(); 
  }

  // Producer class to add elements to buffer
  static class Producer implements Runnable{
    Buffer buffer;
    int capacity;
    Producer(Buffer buffer, int capacity){
      this.buffer = buffer;
      this.capacity = capacity;
    }
    @Override
    public void run() {
      for(int i = 1; i <= capacity; i++){
        try {
          buffer.put(i);
        } catch (InterruptedException e) {
          // Restore the interrupt status and stop producing
          Thread.currentThread().interrupt();
          return;
        }
      }
    }
  }
  // Consumer class to remove elements from buffer
  static class Consumer implements Runnable{
    Buffer buffer;
    int capacity;
    Consumer(Buffer buffer, int capacity){
      this.buffer = buffer;
      this.capacity = capacity;
    }
    
    @Override
    public void run() {
      for(int i = 1; i <= capacity; i++){
        try {
          System.out.println("Item removed- " + buffer.take());
        } catch (InterruptedException e) {
          // Restore the interrupt status and stop consuming
          Thread.currentThread().interrupt();
          return;
        }
      }
    }
  }
	
  static class Buffer {
    private Object[] items;
    final Lock lock = new ReentrantLock();
    // Conditions
    final Condition notFull  = lock.newCondition(); 
    final Condition notEmpty = lock.newCondition(); 
    int putptr, takeptr, count;
    public Buffer(int capacity){
      items = new Object[capacity];
    }
		
    public void put(Object x) throws InterruptedException {
      lock.lock();
      try {
        while (count == items.length)
          notFull.await();
        items[putptr] = x;
        System.out.println("Putting- "+ x);
        if (++putptr == items.length) { 
          putptr = 0;
        }
        ++count;
        notEmpty.signal();
      } finally {
        lock.unlock();
      }
    }

    public Object take() throws InterruptedException {
      lock.lock();
      try {
        while (count == 0) {
          notEmpty.await();
        }
        Object item = items[takeptr];
        if (++takeptr == items.length) {
          takeptr = 0;
        }
        --count;
        notFull.signal();
        return item;
      } finally {
        lock.unlock();
      }
    }
  }
}
Output
Putting- 1
Putting- 2
Putting- 3
Putting- 4
Putting- 5
Item removed- 1
Item removed- 2
Item removed- 3
Item removed- 4
Item removed- 5

That's all for the topic Java Condition Interface. If something is missing or you have something to share about the topic please write a comment.


Convert LocalDate to Date in Java

This post shows how to convert java.time.LocalDate to java.util.Date in Java.

The steps for converting LocalDate to Date are as follows-

  1. Get a ZonedDateTime from the LocalDate by calling atStartOfDay() with a ZoneId.
  2. Convert that ZonedDateTime to Instant instance using toInstant() method.
  3. Pass instant to Date.from() method to get a java.util.Date instance.

Written out step by step, the conversion looks as follows-

LocalDate ld = LocalDate.now();
System.out.println("Local Date - " + ld);
ZonedDateTime zdt = ld.atStartOfDay(ZoneId.systemDefault());
Instant instant = zdt.toInstant();
Date date = Date.from(instant);
System.out.println("Date- " + date);
Output
Local Date - 2019-11-20
Date- Wed Nov 20 00:00:00 IST 2019

You can also do it in one line as given below-

LocalDate ld = LocalDate.now();
Date date = Date.from(ld.atStartOfDay(ZoneId.systemDefault()).toInstant());
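
The resulting Date depends on the ZoneId used for the start of day. Here is a small sketch with a fixed date and two explicit zones instead of the system default; the class name and the toDate() helper are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Date;

// Hypothetical demo class: same LocalDate, different zones, different instants
public class LocalDateToDateDemo {

  // Converts the start of the given day in the given zone to java.util.Date
  static Date toDate(LocalDate localDate, ZoneId zone) {
    return Date.from(localDate.atStartOfDay(zone).toInstant());
  }

  public static void main(String[] args) {
    LocalDate ld = LocalDate.of(2019, 11, 20);
    Date utc = toDate(ld, ZoneId.of("UTC"));
    Date kolkata = toDate(ld, ZoneId.of("Asia/Kolkata"));
    // Printing the Instant avoids the default time zone used by Date.toString()
    System.out.println("UTC start of day- " + utc.toInstant());         // 2019-11-20T00:00:00Z
    System.out.println("Kolkata start of day- " + kolkata.toInstant()); // 2019-11-19T18:30:00Z
  }
}
```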

That's all for the topic Convert LocalDate to Date in Java. If something is missing or you have something to share about the topic please write a comment.


Compare Dates in Java

This post has Java programs to compare dates in Java. The options are the Date class methods, the Calendar class methods and, from Java 8 onward, the methods in the LocalDate, LocalTime and LocalDateTime classes.

Comparing java.util.Date

If you have two Date instances and you want to compare them then the methods in the Date class that can be used are-

  • compareTo(Date anotherDate)- Compares two Dates for ordering. Returns 0 if the argument Date is equal to this Date; a value less than 0 if this Date is before the passed Date argument; and a value greater than 0 if this Date is after the Date argument.
  • equals(Object obj)- Compares two dates for equality. The result is true if and only if the argument is not null and is a Date object that represents the same point in time, to the millisecond, as this object.
  • after(Date when)- Tests if this date is after the specified date. Returns true if the instant represented by this Date object is strictly later than the instant represented by when; false otherwise.
  • before(Date when)- Tests if this date is before the specified date. Returns true if and only if the instant of time represented by this Date object is strictly earlier than the instant represented by when; false otherwise.

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class CompareDates {
  public static void main(String[] args) throws ParseException {
    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
    Date dt1 = sdf.parse("2019-10-27");
    Date dt2 = sdf.parse("2019-08-20");
    System.out.println("Date1 is- "+ sdf.format(dt1));
    System.out.println("Date2 is- "+ sdf.format(dt2));
    compareDates(dt1, dt2);
  }
	
  private static void compareDates(Date dt1, Date dt2) {
    if(dt1.compareTo(dt2) > 0) {
      System.out.println("Date1 comes after date2");
    }else if(dt1.compareTo(dt2) < 0) {
      System.out.println("Date1 comes before date2");
    }else {
      System.out.println("Date1 equals date2");
    }
		
    // Using after method
    if(dt1.after(dt2)) {
      System.out.println("Date1 comes after date2");
    }else {
      System.out.println("Date1 comes before date2");
    }
		
    // Using before method
    if(dt1.before(dt2)) {
      System.out.println("Date1 comes before date2");
    }else {
      System.out.println("Date1 comes after date2");
    }
		
    //using equals method
    if(dt1.equals(dt2)) {
      System.out.println("Date1 equals date2");
    }else {
      System.out.println("Date1 is not equal to date2");
    }
  }
}
Output
Date1 is- 2019-10-27
Date2 is- 2019-08-20
Date1 comes after date2
Date1 comes after date2
Date1 comes after date2
Date1 is not equal to date2

Comparing java.util.Calendar

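If you have Calendar instances then you can compare them the same way Date instances are compared. The Calendar class has similar methods: compareTo, equals, after and before. Note that Calendar's equals() returns true only when both instances represent the same time value under the same calendar parameters (such as time zone), whereas compareTo() compares only the time values.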

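import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

public class CompareDates {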

  public static void main(String[] args) throws ParseException {

    SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
    Date dt1 = sdf.parse("2018-09-27");
    Date dt2 = sdf.parse("2019-08-20");
    Calendar cal1 = Calendar.getInstance();
    Calendar cal2 = Calendar.getInstance();
    cal1.setTime(dt1);
    cal2.setTime(dt2);
    System.out.println("Date1 is- "+ sdf.format(cal1.getTime()));
    System.out.println("Date2 is- "+ sdf.format(cal2.getTime()));
    compareDates(cal1, cal2);
  }
	
  // Comparing Calendar instances
  private static void compareDates(Calendar cal1, Calendar cal2) {
    if(cal1.compareTo(cal2) > 0) {
      System.out.println("Date1 comes after date2");
    }else if(cal1.compareTo(cal2) < 0) {
      System.out.println("Date1 comes before date2");
    }else {
      System.out.println("Date1 equals date2");
    }
    
    // Using after method
    if(cal1.after(cal2)) {
      System.out.println("Date1 comes after date2");
    }else {
      System.out.println("Date1 comes before date2");
    }
    
    // Using before method
    if(cal1.before(cal2)) {
      System.out.println("Date1 comes before date2");
    }else {
      System.out.println("Date1 comes after date2");
    }
    
    //using equals method
    if(cal1.equals(cal2)) {
      System.out.println("Date1 equals date2");
    }else {
      System.out.println("Date1 is not equal to date2");
    }
  }
}
Output
Date1 is- 2018-09-27
Date2 is- 2019-08-20
Date1 comes before date2
Date1 comes before date2
Date1 comes before date2
Date1 is not equal to date2
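
One difference from Date is worth a sketch: Calendar's equals() also takes calendar parameters such as the time zone into account, while compareTo() looks only at the time value. The example below assumes two GregorianCalendar instances set to the same instant in different zones; the class name and the at() helper are hypothetical.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical demo class: same instant, different time zones
public class CalendarEqualsDemo {

  // Creates a calendar in the given zone set to the given epoch milliseconds
  static Calendar at(long epochMillis, String zoneId) {
    Calendar cal = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
    cal.setTimeInMillis(epochMillis);
    return cal;
  }

  public static void main(String[] args) {
    Calendar cal1 = at(0L, "UTC");
    Calendar cal2 = at(0L, "Asia/Kolkata");
    System.out.println("compareTo- " + cal1.compareTo(cal2));                       // 0
    System.out.println("equals- " + cal1.equals(cal2));                             // false
    System.out.println("Same instant- " + cal1.getTime().equals(cal2.getTime()));   // true
  }
}
```

If only the point in time matters, comparing with compareTo() or comparing the Date values from getTime() is the safer choice.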

Comparing LocalDates in Java

Java 8 onward you can use classes in the new Date and Time API for comparing dates in Java. Here is an example using LocalDate instances. Similar methods are there in LocalTime and LocalDateTime classes too. For comparing two LocalDate instances there are the following methods-

  • compareTo(ChronoLocalDate other)- Compares this date to another date. Returns the comparator value, negative if less, positive if greater.
  • isAfter(ChronoLocalDate other)- Checks if this date is after the specified date. Returns true if this date is after the specified date.
  • isBefore(ChronoLocalDate other)- Checks if this date is before the specified date. Returns true if this date is before the specified date.
  • isEqual(ChronoLocalDate other)- Checks if this date is equal to the specified date. Returns true if this date is equal to the specified date.

import java.time.LocalDate;
import java.time.Month;

public class CompareDates {

  public static void main(String[] args) {
    LocalDate ld1 = LocalDate.of(2019, Month.OCTOBER, 18);
    LocalDate ld2 = LocalDate.of(2019, Month.SEPTEMBER, 20);
    System.out.println(ld1.compareTo(ld2));
    System.out.println(ld2.compareTo(ld1));
    
    System.out.println(ld1.isAfter(ld2));
    System.out.println(ld1.isBefore(ld2));
    System.out.println(ld1.isEqual(ld2));
  }
}
Output
1
-1
true
false
false
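
The same style of comparison works for LocalDateTime and LocalTime. A short sketch follows, with a hypothetical class name and an order() helper that reduces compareTo() to its sign, since only the sign of the returned value is meaningful. Note that LocalTime has isAfter() and isBefore() but no isEqual(), so equals() is used for it.

```java
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical demo class comparing LocalDateTime and LocalTime values
public class CompareDateTimeDemo {

  // Returns -1, 0 or 1 depending on how a is ordered relative to b
  static int order(LocalDateTime a, LocalDateTime b) {
    return Integer.signum(a.compareTo(b));
  }

  public static void main(String[] args) {
    LocalDateTime dt1 = LocalDateTime.of(2019, 10, 18, 10, 15);
    LocalDateTime dt2 = LocalDateTime.of(2019, 10, 18, 18, 30);
    System.out.println(order(dt1, dt2));     // -1
    System.out.println(dt1.isBefore(dt2));   // true
    System.out.println(dt1.isAfter(dt2));    // false
    System.out.println(dt1.isEqual(dt2));    // false

    LocalTime t1 = dt1.toLocalTime();
    LocalTime t2 = dt2.toLocalTime();
    System.out.println(t1.isBefore(t2));     // true
    System.out.println(t1.equals(t2));       // false
  }
}
```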

That's all for the topic Compare Dates in Java. If something is missing or you have something to share about the topic please write a comment.


How to Split a String in Java

This post shows how you can split a String in Java using the split() method. The String you need to split may be delimited using pipe, tab or spaces, so let’s see how to use split() to split such delimited data. Note that split() takes a regular expression, so if the delimiter is a character with a special meaning in a regular expression (such as the pipe symbol) you need to escape it using the escape character (\).

Splitting String delimited using pipe(|) symbol - Java code

public class SplitString {
  public static void main(String[] args) {
    String str = "A001|BOA|Ronda|5000";
    String[] data = str.split("\\|");
    System.out.println("Name- " + data[2]);
  }
}
Output
Name- Ronda
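
As an alternative to escaping the delimiter by hand, Pattern.quote() can turn any delimiter into a literal pattern. A minimal sketch, assuming a hypothetical splitOn() helper:

```java
import java.util.regex.Pattern;

// Hypothetical demo class: splitting on a delimiter treated as literal text
public class SplitQuotedDelimiter {

  // Splits text on the delimiter without interpreting it as a regular expression
  static String[] splitOn(String text, String delimiter) {
    return text.split(Pattern.quote(delimiter));
  }

  public static void main(String[] args) {
    String[] data = splitOn("A001|BOA|Ronda|5000", "|");
    System.out.println("Name- " + data[2]);          // Name- Ronda

    // The dot is also a special regex character
    String[] parts = splitOn("10.20.30", ".");
    System.out.println("Parts- " + parts.length);    // Parts- 3
  }
}
```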

Splitting data in Java delimited using tab (\t)

public class SplitString {
  public static void main(String[] args) {
    String str = "A001	BOA	Ronda	5000";
    String[] data = str.split("\t");
    System.out.println("Amount- " + data[3]);
  }
}
Output
Amount- 5000

Splitting data delimited using spaces- Java code

public class SplitString {
  public static void main(String[] args) {
    String str = "A001  BOA Ronda 5000";
    // Matches one or more whitespace characters
    String[] data = str.split("\\s+");
    System.out.println("Amount- " + data[3]);
  }
}
Output
Amount- 5000

Splitting data delimited using single space

public class SplitString {
  public static void main(String[] args) {
    String str = "A001 BOA Ronda 5000";
    // Matches a single whitespace character
    String[] data = str.split("\\s");
    System.out.println("Name- " + data[2]);
  }
}
Output
Name- Ronda
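
The difference between "\\s" and "\\s+" shows up when the data has consecutive spaces. Here is a small sketch using the string from the earlier example, which has two spaces after the first field; the class and helper names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class comparing "\\s" and "\\s+" as split patterns
public class SplitWhitespaceDemo {

  // Splits on every single whitespace character
  static String[] splitSingle(String text) {
    return text.split("\\s");
  }

  // Splits on runs of one or more whitespace characters
  static String[] splitRuns(String text) {
    return text.split("\\s+");
  }

  public static void main(String[] args) {
    String str = "A001  BOA Ronda 5000"; // two spaces after A001
    // The second space produces an empty element
    System.out.println(Arrays.toString(splitSingle(str))); // [A001, , BOA, Ronda, 5000]
    System.out.println(Arrays.toString(splitRuns(str)));   // [A001, BOA, Ronda, 5000]
  }
}
```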

That's all for the topic How to Split a String in Java. If something is missing or you have something to share about the topic please write a comment.


Display Time in 24 Hour Format in Java

This post shows how to display time in 24 hour format in Java using SimpleDateFormat and DateTimeFormatter class (Java 8 onward).

Pattern for time in 24 hour format

In Java the pattern letters for the hour in 24 hour format are as follows-

  • H- Hour in day (0-23), will return 0-23 for hours.
  • k- Hour in day (1-24), will return 1-24 for hours.

As per your requirement for displaying time use the appropriate hour pattern.

Using SimpleDateFormat

Date date = new Date();
// Pattern 
SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss");
System.out.println("Time in 24 Hour format - " + sdf.format(date));
Output
Time in 24 Hour format - 16:13:58

Here is another program which shows the difference between using ‘HH’ and ‘kk’ as an hour format.

import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

public class FormatDate {
  public static void main(String[] args) {
    Date date = new GregorianCalendar(2019, Calendar.SEPTEMBER, 15, 24, 20, 15).getTime();
    System.out.println("DateTime is- " + date);
    // Pattern 
    SimpleDateFormat sdf1 = new SimpleDateFormat("dd-MMM-yyyy kk:mm:ss");
    SimpleDateFormat sdf2 = new SimpleDateFormat("dd-MMM-yyyy HH:mm:ss");
    System.out.println("Time in 24 Hour format - " + sdf1.format(date));
    System.out.println("Time in 24 Hour format - " + sdf2.format(date));
  }
}
Output
DateTime is- Mon Sep 16 00:20:15 IST 2019
Time in 24 Hour format - 16-Sep-2019 24:20:15
Time in 24 Hour format - 16-Sep-2019 00:20:15

Using DateTimeFormatter

Java 8 onward you can use the new date and time API classes like LocalTime for representing time and DateTimeFormatter for specifying the pattern.

LocalTime time = LocalTime.now();
// Pattern 
DateTimeFormatter pattern = DateTimeFormatter.ofPattern("HH:mm:ss");
System.out.println("Time in 24 Hour format - " + time.format(pattern));
Output
Time in 24 Hour format - 16:28:08
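
The 'H' and 'k' pattern letters behave the same way in DateTimeFormatter, which is easiest to see at midnight. A minimal sketch with fixed times rather than the current time; the class name and the format() helper are hypothetical.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class comparing the HH and kk hour patterns
public class HourPatternDemo {

  // Formats the time with the given pattern
  static String format(LocalTime time, String pattern) {
    return time.format(DateTimeFormatter.ofPattern(pattern));
  }

  public static void main(String[] args) {
    LocalTime midnight = LocalTime.MIDNIGHT;
    System.out.println("HH- " + format(midnight, "HH:mm:ss")); // HH- 00:00:00
    System.out.println("kk- " + format(midnight, "kk:mm:ss")); // kk- 24:00:00

    // For any other hour the two patterns give the same result
    LocalTime afternoon = LocalTime.of(16, 28, 8);
    System.out.println("HH- " + format(afternoon, "HH:mm:ss")); // HH- 16:28:08
    System.out.println("kk- " + format(afternoon, "kk:mm:ss")); // kk- 16:28:08
  }
}
```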

That's all for the topic Display Time in 24 Hour Format in Java. If something is missing or you have something to share about the topic please write a comment.


Getting Current Date and Time in Java

In this post we’ll see different ways to get current date and time in Java. Options you have are-

  1. java.util.Date
  2. java.util.Calendar
  3. java.time.LocalDate- To get date.
  4. java.time.LocalTime- To get time.
  5. java.time.LocalDateTime- To get both date and time.
  6. java.time.ZonedDateTime- If you want time-zone information too.

Out of these classes LocalDate, LocalTime, LocalDateTime and ZonedDateTime are classes in Java 8 new Date and Time API.

1. Getting Date and Time using java.util.Date

When you instantiate a Date object, it is initialized so that it represents the date and time at which it was allocated.

Date date = new Date();
System.out.println(date);
Output
Thu Oct 10 16:42:21 IST 2019

Using SimpleDateFormat you can format this date.

public class FormatDate {
  public static void main(String[] args) {
    Date date = new Date();
    // hh is the 12-hour clock hour; use HH for 24-hour time or add 'a' for the AM/PM marker
    SimpleDateFormat sdf = new SimpleDateFormat("dd/MM/yyyy hh:mm:ss.SSS");
    System.out.println(sdf.format(date));
  }
}
Output
10/10/2019 04:50:49.197

2. Getting Date and Time using java.util.Calendar

Using getInstance() static method of the Calendar class you can get an instance of Calendar.

public class FormatDate {
  public static void main(String[] args) {
    Calendar calendar = Calendar.getInstance();
    SimpleDateFormat sdf = new SimpleDateFormat("MMM-dd-yyyy hh:mm:ss");
    System.out.println(sdf.format(calendar.getTime()));
  }
}

3. Using java.time.LocalDate

LocalDate represents a date without time-zone in the ISO-8601 calendar system. Using now() method you can obtain the current date from the system clock in the default time-zone.

For formatting date you can use DateTimeFormatter class which is also added in Java 8.

public class FormatDate {
  public static void main(String[] args) {
    // get date
    LocalDate date = LocalDate.now();
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern("MM/dd/yyyy");
    System.out.println(date.format(formatter));
  }
}
Output
10/10/2019

4. Using java.time.LocalTime

LocalTime represents a time without a time-zone in the ISO-8601 calendar system, such as 08:10:30.

Using now() method you can obtain the current time from the system clock in the default time-zone.

public class FormatDate {
  public static void main(String[] args) {
    // get time
    LocalTime time = LocalTime.now();
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern("hh:mm:ss a");
    System.out.println(time.format(formatter));
  }
}
Output
05:11:31 PM

5. Using java.time.LocalDateTime

LocalDateTime represents a date-time without a time-zone in the ISO-8601 calendar system, such as 2007-12-03T10:15:30.

Using now() method you can obtain the current date-time from the system clock in the default time-zone.

public class FormatDate {
  public static void main(String[] args) {
    // get datetime
    LocalDateTime dateTime = LocalDateTime.now();
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS");
    System.out.println(dateTime.format(formatter));
  }
}
Output
2019-10-10T17:14:41.098

6. Using java.time.ZonedDateTime

ZonedDateTime represents date-time with a time-zone in the ISO-8601 calendar system, such as 2007-12-03T10:15:30+01:00 Europe/Paris. If you want zone offset and time-zone then you can use ZonedDateTime instance.

public class FormatDate {
  public static void main(String[] args) {
    // get datetime
    ZonedDateTime dateTime = ZonedDateTime.now();
    // z=time-zone name, VV=time-zone ID
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS z VV");
    System.out.println(dateTime.format(formatter));
  }
}
Output
2019-10-10T17:22:31.958 IST Asia/Calcutta
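
The now() methods of these java.time classes also have overloads that take a ZoneId or a Clock, which helps when the default time-zone is not the one you want or when the current time has to be fixed for a test. A small sketch, assuming a hypothetical currentDateTime() helper and a fixed Clock chosen only for illustration:

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical demo class for the now(ZoneId) and now(Clock) overloads
public class NowWithClockDemo {

  // Reads the current date-time from the supplied clock instead of the system clock
  static LocalDateTime currentDateTime(Clock clock) {
    return LocalDateTime.now(clock);
  }

  public static void main(String[] args) {
    // Current date-time in an explicit zone (value varies from run to run)
    System.out.println(ZonedDateTime.now(ZoneId.of("Europe/Paris")));

    // A fixed clock always reports the same instant
    Clock fixed = Clock.fixed(Instant.parse("2019-10-10T11:44:41Z"), ZoneId.of("Asia/Kolkata"));
    System.out.println(currentDateTime(fixed)); // 2019-10-10T17:14:41
  }
}
```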

That's all for the topic Getting Current Date and Time in Java. If something is missing or you have something to share about the topic please write a comment.


Convert String to Date in Java

In this post we’ll see how to convert String to Date in Java.

For converting Date to String in Java check this post- Convert Date to String in Java

Before Java 8, SimpleDateFormat class was the option in Java for converting String to Date. Java 8 onward you can use classes in package java.time which are part of new date and time API for the conversion. We’ll see examples using methods of both of these classes.

Converting String to Date using SimpleDateFormat

You can use parse() method of the Java SimpleDateFormat class that parses text from a string to produce a Date.

First create an instance of SimpleDateFormat, passing the date and time pattern for parsing. Then call the parse() method passing the date String; the method returns the parsed Date. A ParseException is thrown if the beginning of the String cannot be parsed.

In the example different types of date Strings are converted to java.util.Date instances.

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ParseDate {
  public static void main(String[] args) {
    try {
      parseDT("dd/MM/yyyy", "09/08/2019");
      
      parseDT("MM-dd-yyyy", "09-08-2019");
      // Date will default to epoch (January 1, 1970)
      parseDT("HH:mm:ss", "20:04:19");
      
      parseDT("MM-dd-yyyy HH:mm:ss", "09-08-2019 20:04:19");
    }catch (ParseException e) {
      System.out.println("Error while parsing- " + e.getMessage());
    }

  }
	
  private static void parseDT(String pattern, String dateTime) throws ParseException{
    // Create date format as per specified pattern
    SimpleDateFormat sdf = new SimpleDateFormat(pattern);
    // parsing
    Date dt = sdf.parse(dateTime);
    System.out.println("After parsing- " + dt);
  }
}
Output
After parsing- Fri Aug 09 00:00:00 IST 2019
After parsing- Sun Sep 08 00:00:00 IST 2019
After parsing- Thu Jan 01 20:04:19 IST 1970
After parsing- Sun Sep 08 20:04:19 IST 2019
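
By default SimpleDateFormat parses leniently, so an out-of-range value such as day 32 is rolled over into the next month rather than rejected. Here is a minimal sketch of switching that off with setLenient(false); the class name and the isValid() helper are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Hypothetical demo class comparing lenient and strict parsing
public class StrictParseDemo {

  // Returns true if the text can be parsed with the pattern under the given leniency
  static boolean isValid(String pattern, String text, boolean lenient) {
    SimpleDateFormat sdf = new SimpleDateFormat(pattern);
    sdf.setLenient(lenient);
    try {
      sdf.parse(text);
      return true;
    } catch (ParseException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Lenient (default): 32/01/2019 is accepted and interpreted as 01 Feb 2019
    System.out.println("Lenient- " + isValid("dd/MM/yyyy", "32/01/2019", true));  // Lenient- true
    // Strict: the same input causes a ParseException
    System.out.println("Strict- " + isValid("dd/MM/yyyy", "32/01/2019", false));  // Strict- false
  }
}
```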

Converting String to Date using new Date & Time API

Java 8 onward you can use parse() method of the LocalDate (representing date), LocalTime (representing time) and LocalDateTime (representing date and time) to convert String to Date.

There are two variants of parse method-

  • parse(CharSequence text)– Here text parameter represents the date string that has to be parsed. String must represent a valid date, time or date-time
  • parse(CharSequence text, DateTimeFormatter formatter) –You can pass an instance of DateTimeFormatter representing formatter to be used for parsing.

Converting String to LocalDate

LocalDate dt = LocalDate.parse("2019-08-03");// Date in ISO-8601 format

Converting String to LocalTime

LocalTime dt = LocalTime.parse("10:15:30");// Time in ISO-8601 format

Converting String to LocalDateTime

LocalDateTime dt = LocalDateTime.parse("2007-12-03T10:15:30");// Date-Time in ISO-8601 format

Converting String to ZonedDateTime

If there is time zone information then use ZonedDateTime.

// Date-Time with time zone in ISO-8601 format
ZonedDateTime dt = ZonedDateTime.parse("2019-07-03T10:15:30+01:00[Europe/Paris]");
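
When the text is not in ISO-8601 format, or is not a valid date, the single-argument parse() methods throw DateTimeParseException. A minimal sketch, assuming a hypothetical parseOrNull() helper that returns null for text that cannot be parsed:

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical demo class for handling invalid ISO-8601 date strings
public class IsoParseDemo {

  // Parses an ISO-8601 date, returning null if the text cannot be parsed
  static LocalDate parseOrNull(String text) {
    try {
      return LocalDate.parse(text);
    } catch (DateTimeParseException e) {
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseOrNull("2019-08-03")); // 2019-08-03
    System.out.println(parseOrNull("03/08/2019")); // null, not in ISO-8601 format
    System.out.println(parseOrNull("2019-02-30")); // null, February 30 does not exist
  }
}
```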

Converting String to Date using custom formatter

In DateTimeFormatter class there is a static method ofPattern() using which you can specify the pattern for date time formatting.

import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class ParseDate {

  public static void main(String[] args) {
    try{
      LocalDate localDate = LocalDate.parse("30/06/2019", DateTimeFormatter.ofPattern("dd/MM/yyyy"));
      System.out.println("Date " + localDate);
         
      localDate = LocalDate.parse("Thu, Sep 19, '19", DateTimeFormatter.ofPattern("EEE, MMM d, ''yy"));
      System.out.println("Date " + localDate);
      
      LocalTime localTime = LocalTime.parse("20:17:46", DateTimeFormatter.ofPattern("HH:mm:ss"));
      System.out.println("Time " + localTime);
      //DateTime
      LocalDateTime localDateTime = LocalDateTime.parse("2019-08-12T20:17:46.384Z", DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSz"));
      System.out.println("Date-Time " + localDateTime);
      
      // DateTime with time-zone name
      ZonedDateTime zonedDateTime = ZonedDateTime.parse("2019-08-18 AD at 10:13:46 PDT", DateTimeFormatter.ofPattern("yyyy-MM-dd G 'at' HH:mm:ss z"));
      System.out.println("Date " + zonedDateTime);
      
      // DateTime with zone offset   
      zonedDateTime = ZonedDateTime.parse("2019-08-15 03:32:12-0430", DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ssxx"));
      System.out.println("Date " + zonedDateTime);
      
    }catch(DateTimeParseException ex){
      ex.printStackTrace();
    }		
  }
}
Output
Date 2019-06-30
Date 2019-09-19
Time 20:17:46
Date-Time 2019-08-12T20:17:46.384
Date 2019-08-18T10:13:46-07:00[America/Los_Angeles]
Date 2019-08-15T03:32:12-04:30

That's all for the topic Convert String to Date in Java. If something is missing or you have something to share about the topic please write a comment.


Convert Date to String in Java

In this post we’ll see how to convert Date to String in Java.

For converting String to Date in Java check this post- Convert String to Date in Java

Before Java 8, SimpleDateFormat was the class to use for converting Date to String with the specified formatting. Java 8 onward there is another option java.time.format.DateTimeFormatter class that can be used for the conversion.

Converting Date to String using SimpleDateFormat

While creating an instance of SimpleDateFormat you can pass the pattern for formatting. SimpleDateFormat has a format method which takes Date instance as parameter and returns the formatted date (and time) string.

Here is an example where current date is converted to String using different date and time formatting patterns.

import java.text.SimpleDateFormat;
import java.util.Date;

public class FormatDate {
  public static void main(String[] args) {
    // For date in format 2019.07.04 AD at 11:08:54 IST
    formatDate("yyyy.MM.dd G 'at' HH:mm:ss z");
    // For date in format Mon, Oct 7, '19
    formatDate("EEE, MMM d, ''yy");
    // For date in format Monday, October 07, 2019
    formatDate("EEEE, MMMM dd, yyyy");
    // For time in format 07 o'clock PM, India Standard Time
    formatDate("hh 'o''clock' a, zzzz");
    // For time in 24 Hr format 19:41:59:635 PM
    formatDate("HH:mm:ss:SSS a");
    // For date-time in format 2019-10-07T19:27:38.571 +0530
    formatDate("yyyy-MM-dd'T'HH:mm:ss.SSS Z");
    // For date in format 05/08/2016
    formatDate("MM/dd/yyyy");
    // For date in format 07/10/2019 19:29:40 PM
    formatDate("dd/MM/yyyy HH:mm:ss a");
    // For date in format 07/Oct/2019 AD 14:25:51:048 PM
    formatDate("dd/MMM/yyyy GGG HH:mm:ss:SSS a");
  }

  private static void formatDate(String pattern){
    Date dt = new Date();
    // Create date format as per specified pattern
    SimpleDateFormat sdf = new SimpleDateFormat(pattern);
    String formattedDate = sdf.format(dt);
    System.out.println("Formatted Date- " + formattedDate +
              " for Pattern: " + pattern); 
  }
}
Output
Formatted Date- 2019.10.09 AD at 18:15:53 IST for Pattern: yyyy.MM.dd G 'at' HH:mm:ss z
Formatted Date- Wed, Oct 9, '19 for Pattern: EEE, MMM d, ''yy
Formatted Date- Wednesday, October 09, 2019 for Pattern: EEEE, MMMM dd, yyyy
Formatted Date- 06 o'clock PM, India Standard Time for Pattern: hh 'o''clock' a, zzzz
Formatted Date- 18:15:53:978 PM for Pattern: HH:mm:ss:SSS a
Formatted Date- 2019-10-09T18:15:53.979 +0530 for Pattern: yyyy-MM-dd'T'HH:mm:ss.SSS Z
Formatted Date- 10/09/2019 for Pattern: MM/dd/yyyy
Formatted Date- 09/10/2019 18:15:53 PM for Pattern: dd/MM/yyyy HH:mm:ss a
Formatted Date- 09/Oct/2019 AD 18:15:53:981 PM for Pattern: dd/MMM/yyyy GGG HH:mm:ss:SSS a
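
The output above depends on the default locale and time zone of the machine. If you need the same text everywhere, you can pass a Locale to the constructor and set a TimeZone on the formatter. A small sketch with a fixed Date (the epoch) so that the result is repeatable; the class name and the format() helper are hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class: formatting with an explicit locale and time zone
public class FixedLocaleFormatDemo {

  // Formats the date in English and in UTC, independent of the system defaults
  static String format(Date date, String pattern) {
    SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.ENGLISH);
    sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
    return sdf.format(date);
  }

  public static void main(String[] args) {
    Date epoch = new Date(0L); // 1970-01-01T00:00:00Z
    System.out.println(format(epoch, "EEE, MMM d, ''yy"));             // Thu, Jan 1, '70
    System.out.println(format(epoch, "yyyy-MM-dd'T'HH:mm:ss.SSS Z"));  // 1970-01-01T00:00:00.000 +0000
  }
}
```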

Converting Date to String using DateTimeFormatter

In DateTimeFormatter class there is a static method ofPattern() using which you can specify the pattern for date time formatting.

Using the format() method of LocalDate (representing date), LocalTime (representing time) and LocalDateTime (representing date and time) you can convert a date to a String.

DateTimeFormatter instance created using ofPattern() method is passed as parameter in format() method.

import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class FormatDate {
  public static void main(String[] args) {
    // LocalDateTime
    // For date in format 2019.07.04 AD at 11:08:54
    LocalDateTime dateTime = LocalDateTime.now();
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern("yyyy.MM.dd G 'at' HH:mm:ss");
    String formattedDate = dateTime.format(formatter);
    System.out.println("Formatted Date- " + formattedDate);
    
    // For date in format Mon, Oct 7, '19
    formatter = DateTimeFormatter.ofPattern("EEE, MMM d, ''yy");
    formattedDate = dateTime.format(formatter);
    System.out.println("Formatted Date- " + formattedDate);

    // For date in format Monday, October 07, 2019
    formatter = DateTimeFormatter.ofPattern("EEEE, MMMM dd, yyyy");
    formattedDate = dateTime.format(formatter);
    System.out.println("Formatted Date- " + formattedDate);
    
    // For date-time in format 2019-10-07T19:27:38.571
    formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS");
    formattedDate = dateTime.format(formatter);
    System.out.println("Formatted Date- " + formattedDate);
    
    // For date in format 07/10/2019 19:29:40 PM
    formatter = DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm:ss a");
    formattedDate = dateTime.format(formatter);
    System.out.println("Formatted Date- " + formattedDate);
    
    // For date in format 07/Oct/2019 AD 14:25:51:048 PM
    formatter = DateTimeFormatter.ofPattern("dd/MMM/yyyy GGG HH:mm:ss:SSS a");
    formattedDate = dateTime.format(formatter);
    System.out.println("Formatted Date- " + formattedDate);
    
    // LocalTime
    LocalTime time = LocalTime.now();
    // For time in 24 Hr format 19:41:59:635 PM
    formatter = DateTimeFormatter.ofPattern("HH:mm:ss:SSS a");
    formattedDate = time.format(formatter);
    System.out.println("Formatted Time- " + formattedDate);
    
    // LocalDate
    LocalDate date = LocalDate.now();
    // For date in format 05/08/2016
    formatter = DateTimeFormatter.ofPattern("MM/dd/yyyy");
    formattedDate = date.format(formatter);
    System.out.println("Formatted Date- " + formattedDate);
  }
	
}
Output
Formatted Date- 2019.10.10 AD at 14:27:38
Formatted Date- Thu, Oct 10, '19
Formatted Date- Thursday, October 10, 2019
Formatted Date- 2019-10-10T14:27:38.014
Formatted Date- 10/10/2019 14:27:38 PM
Formatted Date- 10/Oct/2019 AD 14:27:38:014 PM
Formatted Time- 14:27:38:194 PM
Formatted Date- 10/10/2019

If you have a zone offset (Z) or time zone name (z) in the pattern then you need a ZonedDateTime instance, as LocalDateTime does not carry any time-zone information; formatting a LocalDateTime with such a pattern throws a DateTimeException.

import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class FormatDate {

  public static void main(String[] args) {
    formatDate("yyyy.MM.dd G 'at' HH:mm:ss z");
    // For time in format 07 o'clock PM, India Standard Time
    formatDate("hh 'o''clock' a, zzzz");
    // For date-time in format 2019-10-07T19:27:38.571 +0530
    formatDate("yyyy-MM-dd'T'HH:mm:ss.SSS Z");
  }

  private static void formatDate(String pattern){
    ZonedDateTime dateTime = ZonedDateTime.now();
    // Create DateTimeFormatter instance as per specified pattern
    DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
    String formattedDate = dateTime.format(formatter);
    System.out.println("Formatted Date- " + formattedDate +
              " for Pattern: " + pattern); 
  }
}
Output
Formatted Date- 2019.10.09 AD at 18:25:00 IST for Pattern: yyyy.MM.dd G 'at' HH:mm:ss z
Formatted Date- 06 o'clock PM, India Standard Time for Pattern: hh 'o''clock' a, zzzz
Formatted Date- 2019-10-09T18:25:00.975 +0530 for Pattern: yyyy-MM-dd'T'HH:mm:ss.SSS Z
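
If you already have a LocalDateTime, you can attach a zone with atZone() before formatting with a zone pattern. A minimal sketch with a fixed date-time and an explicit ZoneId; the class name and both helper methods are hypothetical.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class: zone patterns need a zone-aware value
public class ZonePatternDemo {

  // Attaches the zone to the LocalDateTime and then formats it
  static String formatInZone(LocalDateTime ldt, ZoneId zone, String pattern) {
    return ldt.atZone(zone).format(DateTimeFormatter.ofPattern(pattern));
  }

  // Returns true if formatting the bare LocalDateTime with the pattern fails
  static boolean failsWithoutZone(LocalDateTime ldt, String pattern) {
    try {
      ldt.format(DateTimeFormatter.ofPattern(pattern));
      return false;
    } catch (DateTimeException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    LocalDateTime ldt = LocalDateTime.of(2019, 10, 9, 18, 25, 0);
    String pattern = "yyyy-MM-dd HH:mm:ss Z";
    System.out.println("Fails without zone- " + failsWithoutZone(ldt, pattern));      // Fails without zone- true
    System.out.println(formatInZone(ldt, ZoneId.of("Asia/Kolkata"), pattern));        // 2019-10-09 18:25:00 +0530
  }
}
```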

Using DateTimeFormatter.ofLocalizedDate() method

In DateTimeFormatter class there are also following static methods that can be used for converting date and time to String.

  • ofLocalizedDate(FormatStyle dateStyle)- Returns a locale specific date format for the ISO chronology.
  • ofLocalizedDateTime(FormatStyle dateTimeStyle)- Returns a locale specific date-time formatter for the ISO chronology.
  • ofLocalizedDateTime(FormatStyle dateStyle, FormatStyle timeStyle)- Returns a locale specific date and time format for the ISO chronology.
  • ofLocalizedTime(FormatStyle timeStyle)- Returns a locale specific time format for the ISO chronology.

Here java.time.format.FormatStyle is an Enum that has the following constant fields-

  • FULL- Full text style, with the most detail. For example, the format might be 'Tuesday, April 12, 1952 AD' or '3:30:42pm PST'.
  • LONG- Long text style, with lots of detail. For example, the format might be 'January 12, 1952'.
  • MEDIUM- Medium text style, with some detail. For example, the format might be 'Jan 12, 1952'.
  • SHORT- Short text style, typically numeric. For example, the format might be '12.13.52' or '3:30pm'.
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;

public class FormatDate {
  public static void main(String[] args) {
    LocalDateTime dateTime = LocalDateTime.now();
    System.out.println("Full format- " +dateTime.format(DateTimeFormatter.ofLocalizedDate(FormatStyle.FULL)));
    System.out.println("LONG format- " +dateTime.format(DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG)));
    System.out.println("MEDIUM format- " +dateTime.format(DateTimeFormatter.ofLocalizedDate(FormatStyle.MEDIUM)));
    System.out.println("SHORT format- " +dateTime.format(DateTimeFormatter.ofLocalizedDate(FormatStyle.SHORT)));
  }
}
Output
Full format- Wednesday, 9 October, 2019
LONG format- 9 October 2019
MEDIUM format- 09-Oct-2019
SHORT format- 09/10/19
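
The text produced by the localized formatters depends on the default locale of the JVM, which is why the output above is day-first. The following is a minimal sketch of fixing the locale explicitly with withLocale(); the class name LocalizedDateSketch and the chosen locales are assumptions for illustration, and the exact text for a locale can vary with the locale data shipped in the JDK.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

class LocalizedDateSketch {
  // Formats the date in the given style for an explicit locale
  static String format(LocalDate date, FormatStyle style, Locale locale) {
    return date.format(DateTimeFormatter.ofLocalizedDate(style).withLocale(locale));
  }

  public static void main(String[] args) {
    LocalDate date = LocalDate.of(2019, 10, 9);
    // Same date and style, different locales
    System.out.println("US LONG- " + format(date, FormatStyle.LONG, Locale.US));
    System.out.println("UK LONG- " + format(date, FormatStyle.LONG, Locale.UK));
    System.out.println("FRANCE LONG- " + format(date, FormatStyle.LONG, Locale.FRANCE));
  }
}
```

For Locale.US the LONG style gives October 9, 2019; the other locales order and spell the fields according to their own conventions.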

That's all for the topic Convert Date to String in Java. If something is missing or you have something to share about the topic please write a comment.



Java Date Difference Program

In this post we’ll see how to calculate date and time difference in Java in terms of Years, months, days and hours, minutes, seconds.

To calculate the difference between two dates in Java you can parse them with the SimpleDateFormat class and subtract the resulting Date values, though that involves a lot of manual calculation and doesn't take time zones or daylight saving into account.

To address these shortcomings, Java 8 added a new Date and Time API whose classes calculate date and time differences using built-in methods and also take time zones, daylight saving and leap years into account.

Difference between two dates Using SimpleDateFormat

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.TimeUnit;

public class DifferenceDate {
  public static void main(String[] args) {
    try {
      dateDiff("15/08/2019 09:10:05", "04/09/2019 14:22:15", "dd/MM/yyyy HH:mm:ss");
    } catch (ParseException e) {
      // Thrown when a date string does not match the pattern
      e.printStackTrace();
    }
  }
	
  private static void dateDiff(String date1, String date2, String pattern) throws ParseException{
    SimpleDateFormat sdf = new SimpleDateFormat(pattern);
    Date d1 = sdf.parse(date1);
    Date d2 = sdf.parse(date2);
    long diffInMillis = d2.getTime() - d1.getTime();
    
    long daysDiff = TimeUnit.DAYS.convert(diffInMillis, TimeUnit.MILLISECONDS);
    
    long hoursDiff = TimeUnit.HOURS.convert(diffInMillis - (daysDiff * 24 * 60 * 60 * 1000), TimeUnit.MILLISECONDS);

    long minutesDiff = TimeUnit.MINUTES.convert(diffInMillis - (daysDiff * 24 * 60 * 60 * 1000) - (hoursDiff * 60 * 60 * 1000), TimeUnit.MILLISECONDS);

    long secondsDiff = TimeUnit.SECONDS.convert(diffInMillis - (daysDiff * 24 * 60 * 60 * 1000) - (hoursDiff * 60 * 60 * 1000) - (minutesDiff * 60 * 1000), TimeUnit.MILLISECONDS);

    System.out.println(daysDiff + " day(s) " + hoursDiff + " Hour(s) " + minutesDiff + " Minute(s) " + secondsDiff + " Second(s)");
  }
}
Output
20 day(s) 5 Hour(s) 12 Minute(s) 10 Second(s)

As you can see, using SimpleDateFormat requires a lot of manual effort, as you need to convert the difference yourself for each required time unit.

Difference between two dates using Java 8 classes

In the new date and time API in Java 8 there are the following classes and interfaces that can be used for date difference calculation.

  • java.time.Period- A date-based amount of time, supported units of a period are YEARS, MONTHS and DAYS.
  • java.time.Duration- A time-based amount of time. This class models a quantity or amount of time in terms of seconds and nanoseconds. It can be accessed using other duration-based units, such as minutes and hours.
  • java.time.temporal.TemporalUnit- TemporalUnit is an interface that represents a unit of date-time, such as Days or Hours.
  • java.time.temporal.ChronoUnit- It is an Enum that implements TemporalUnit interface.

Difference between two dates in terms of years, months, days

Difference between two dates in date-based amount of time can be calculated using Period class.

import java.time.LocalDate;
import java.time.Period;

public class DifferenceDate {
  public static void main(String[] args) {
    LocalDate date1 = LocalDate.of(2018, 8, 15);
    LocalDate date2 = LocalDate.of(2019, 9, 4);
    dateDiff(date1, date2);
  }
	
  private static void dateDiff(LocalDate date1, LocalDate date2){
    Period p = Period.between(date1, date2);		
    System.out.printf("%d Year(s) %d Month(s) %d Day(s)", p.getYears(), p.getMonths(), p.getDays());
  }
}
Output
1 Year(s) 0 Month(s) 20 Day(s)
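
Note that getDays() returns only the days part of the period, not the total number of days between the two dates. The following is a minimal sketch contrasting the two, using the same dates as above; the class name PeriodVsTotalDays is chosen for illustration.

```java
import java.time.LocalDate;
import java.time.Period;
import java.time.temporal.ChronoUnit;

class PeriodVsTotalDays {
  // Total number of whole days between the two dates
  static long totalDays(LocalDate start, LocalDate end) {
    return ChronoUnit.DAYS.between(start, end);
  }

  public static void main(String[] args) {
    LocalDate date1 = LocalDate.of(2018, 8, 15);
    LocalDate date2 = LocalDate.of(2019, 9, 4);
    Period p = Period.between(date1, date2);
    // Days left over after whole years and months are taken out: 20
    System.out.println("Days part of the period- " + p.getDays());
    // All days between the two dates: 365 + 20 = 385
    System.out.println("Total days- " + totalDays(date1, date2));
  }
}
```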

Difference between two dates in terms of days, hours, minutes, seconds

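Difference between two dates in a time-based amount of time can be calculated using Duration class. Note that the toHoursPart(), toMinutesPart() and toSecondsPart() methods used here were added in Java 9.

import java.time.Duration;
import java.time.LocalDateTime;

public class DifferenceDate {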

  public static void main(String[] args) {
    LocalDateTime date1 = LocalDateTime.of(2019, 9, 3, 9, 10, 5);
    LocalDateTime date2 = LocalDateTime.of(2019, 9, 4, 14, 22, 15);
    dateDiff(date1, date2);
  }
	
  private static void dateDiff(LocalDateTime date1, LocalDateTime date2){
    Duration d = Duration.between(date1, date2);	
    System.out.printf("%d Day(s) %d Hour(s) %d Minute(s) %d Second(s)", d.toDays(), d.toHoursPart(), d.toMinutesPart(), d.toSecondsPart());
  }
}
Output
1 Day(s) 5 Hour(s) 12 Minute(s) 10 Second(s)

Using both Period and Duration classes to calculate date difference

import java.time.Duration;
import java.time.LocalDateTime;
import java.time.Period;

public class DifferenceDate {
  public static void main(String[] args) {
    LocalDateTime date1 = LocalDateTime.of(2018, 7, 2, 12, 18, 13);
    LocalDateTime date2 = LocalDateTime.of(2019, 9, 4, 14, 22, 15);
    dateDiff(date1, date2);
  }

  private static void dateDiff(LocalDateTime date1, LocalDateTime date2){
    Period p = Period.between(date1.toLocalDate(), date2.toLocalDate());
    Duration d = Duration.between(date1, date2);
    System.out.printf("%d Year(s) %d Month(s) %d Day(s) %d Hour(s) %d Minute(s) %d Second(s)", 
        p.getYears(), p.getMonths(), p.getDays(), d.toHoursPart(), d.toMinutesPart(), d.toSecondsPart());
  }
}
Output
1 Year(s) 2 Month(s) 2 Day(s) 2 Hour(s) 4 Minute(s) 2 Second(s)
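
One caveat with combining Period and Duration this way: Period.between() looks only at the dates, so when the end time of day is earlier than the start time of day it counts a day that has not fully elapsed. The following is a minimal sketch of one alternative, not part of the original program; the class name StepwiseDifference and the sample values are chosen for illustration. It walks a cursor from the start towards the end one unit at a time using LocalDateTime.until().

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

class StepwiseDifference {
  // Takes out whole years, then months, days, hours, minutes and seconds in turn
  static String dateDiff(LocalDateTime start, LocalDateTime end) {
    LocalDateTime cursor = start;
    long years = cursor.until(end, ChronoUnit.YEARS);
    cursor = cursor.plusYears(years);
    long months = cursor.until(end, ChronoUnit.MONTHS);
    cursor = cursor.plusMonths(months);
    long days = cursor.until(end, ChronoUnit.DAYS);
    cursor = cursor.plusDays(days);
    long hours = cursor.until(end, ChronoUnit.HOURS);
    cursor = cursor.plusHours(hours);
    long minutes = cursor.until(end, ChronoUnit.MINUTES);
    cursor = cursor.plusMinutes(minutes);
    long seconds = cursor.until(end, ChronoUnit.SECONDS);
    return String.format("%d Year(s) %d Month(s) %d Day(s) %d Hour(s) %d Minute(s) %d Second(s)",
        years, months, days, hours, minutes, seconds);
  }

  public static void main(String[] args) {
    // End time of day (09:10:05) is earlier than the start time of day (14:22:15)
    LocalDateTime date1 = LocalDateTime.of(2019, 9, 3, 14, 22, 15);
    LocalDateTime date2 = LocalDateTime.of(2019, 9, 4, 9, 10, 5);
    System.out.println(dateDiff(date1, date2));
  }
}
```

For these two values the sketch prints 0 Year(s) 0 Month(s) 0 Day(s) 18 Hour(s) 47 Minute(s) 50 Second(s), whereas Period.between() on the dates alone would report 1 day.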

Using ChronoUnit to find difference

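If you want the total difference expressed in a single unit (for example, only hours or only seconds) then ChronoUnit can also be used.

import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

public class DifferenceDate {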

  public static void main(String[] args) {
    LocalDateTime date1 = LocalDateTime.of(2019, 9, 3, 9, 10, 5);
    LocalDateTime date2 = LocalDateTime.of(2019, 9, 4, 14, 22, 15);
    dateDiff(date1, date2);
  }
	
  private static void dateDiff(LocalDateTime date1, LocalDateTime date2){
    long daysDiff = ChronoUnit.DAYS.between(date1, date2);
    long hoursDiff = ChronoUnit.HOURS.between(date1, date2);
    long minutesDiff = ChronoUnit.MINUTES.between(date1, date2);
    long secondsDiff = ChronoUnit.SECONDS.between(date1, date2);
    long millisDiff = ChronoUnit.MILLIS.between(date1, date2);
    long nanoDiff = ChronoUnit.NANOS.between(date1, date2);
    
    System.out.println("Days- "+ daysDiff);
    System.out.println("Hours- "+ hoursDiff);
    System.out.println("Minutes- "+ minutesDiff);
    System.out.println("Seconds- "+ secondsDiff);
    System.out.println("Millis- "+ millisDiff);
    System.out.println("Nano Seconds- "+ nanoDiff);
  }
}
Output
Days- 1
Hours- 29
Minutes- 1752
Seconds- 105130
Millis- 105130000
Nano Seconds- 105130000000000

That's all for the topic Java Date Difference Program. If something is missing or you have something to share about the topic please write a comment.



How MapReduce Works in Hadoop

In the post WordCount MapReduce program we have seen how to write a MapReduce program in Java, create a jar and run it. You do a lot of things to create a MapReduce job, and the Hadoop framework also does a lot of processing internally. In this post we’ll see in detail how MapReduce works in Hadoop internally, using the word count MapReduce program as an example.

What is MapReduce

Hadoop MapReduce is a framework for writing applications that process huge amounts of data in parallel by working on small chunks of the data on a cluster of nodes. The framework ensures that this distributed processing happens in a reliable, fault-tolerant manner.

Map and Reduce

A MapReduce job in Hadoop consists of two phases-

  • Map phase– It has a Mapper class which has a map function specified by the developer. The input and output for Map phase is a (key, value) pair. When you copy the file that has to be processed to HDFS it is split into independent chunks. Hadoop framework creates one map task for each chunk and these map tasks run in parallel.
  • Reduce phase- It has a Reducer class which has a reduce function specified by the developer. The input and output for Reduce phase is also a (key, value) pair. The output of Map phase after some further processing by Hadoop framework (known as sorting and shuffling) becomes the input for reduce phase. So the output of Map phase is the intermediate output and it is processed by Reduce phase to generate the final output.

Since the input and output of both the map and reduce functions are (key, value) pairs, if we say the input for map is (K1, V1) and the output is (K2, V2) then the map function's input and output will have the following form-

(K1, V1) -> list(K2, V2)

The intermediate output of the map function goes through some further processing within the framework, known as the shuffle and sort phase, before it becomes the input to the reduce function. The general form for the reduce function can be depicted as follows-

(K2, list(V2)) -> list(K3, V3)

Note here that the types of the reduce input match the types of the map output.

MapReduce explanation with example

Let’s take the word count MapReduce code as an example and see what happens in both the Map and Reduce phases and how MapReduce works in Hadoop.

When we put the input text file into HDFS it is split into chunks of data. For simplicity's sake let’s say we have two lines in the file and it is split into two parts, with each part having one line.

If the text file has following two lines-

This is a test file
This is a Hadoop MapReduce program file

Then there will be two splits and two map tasks will get those two splits as input.

Mapper class

// Map function
public static class WordMapper extends Mapper<LongWritable, Text, Text, IntWritable>{
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();
  public void map(LongWritable key, Text value, Context context) 
     throws IOException, InterruptedException {
    // Splitting the line on spaces
    String[] stringArr = value.toString().split("\\s+");
    for (String str : stringArr) {
      word.set(str);
      context.write(word, one);
    }
  }
}

In the Mapper class you can see that it has four type parameters: the first two specify the input (key and value) types of the map function and the other two specify its output types.

In this Word count program input key value pair will be as follows-

key- byte offset into the file at which the line starts.

Value– Content of the line.

As we assumed there will be two splits (each having one line of the file) and two map tasks let’s say Map-1 and Map-2, so input to Map-1 and Map-2 will be as follows.

Map-1– (0, This is a test file)

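Map-2– (20, This is a Hadoop MapReduce program file)

The key for Map-2 is 20 rather than 0 because the offset is measured from the start of the file: the first line takes 19 bytes plus, assuming a one-byte line terminator, one more byte.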

The logic in the map function is to split the line on spaces and then write each word to the context with the value 1.

So output from Map-1 will be as follows-

(This, 1)
(is, 1)
(a, 1)
(test, 1)
(file, 1)

And output from Map-2 will be as follows-

(This, 1)
(is, 1)
(a, 1)
(Hadoop, 1)
(MapReduce, 1)
(program, 1)
(file, 1)

Reducer class

// Reduce function
public static class CountReducer extends Reducer<Text, IntWritable, Text, IntWritable>{	   
  private IntWritable result = new IntWritable();

  public void reduce(Text key, Iterable<IntWritable> values, Context context) 
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
      sum += val.get();
    }
    result.set(sum);
    context.write(key, result);
  }
}

In the Reducer class again there are four type parameters: two for the input types and two for the output types of the reduce function.

Note that the input types of the reduce function must match the output types of the map function.

The intermediate output from Map is further processed by the Hadoop framework in the shuffle phase, where it is sorted and grouped by key. After this internal processing the input to reduce will look like this-

[Hadoop, (1)]
[MapReduce, (1)]
[This, (1, 1)]
[a, (1, 1)]
[file, (1, 1)]
[is, (1, 1)]
[program, (1)]
[test, (1)]

You can see that the input to the reduce function is in the form (key, list(values)). In the reduce function, the list of values for each key is iterated and the values are added up. That sum, written with its key, is the final output.

Hadoop 1
MapReduce 1
This 2
a 2
file 2
is 2
program 1
test 1
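
The whole flow can also be imitated in plain Java without a cluster. The following is a minimal sketch under stated assumptions, not Hadoop code: the class WordCountSimulation and its methods are hypothetical stand-ins, a list of pairs plays the role of the map output, and a TreeMap plays the role of the shuffle and sort phase.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSimulation {
  // Map step: emit a (word, 1) pair for every word in the line
  static List<Map.Entry<String, Integer>> map(String line) {
    List<Map.Entry<String, Integer>> output = new ArrayList<>();
    for (String word : line.split("\\s+")) {
      output.add(new AbstractMap.SimpleEntry<>(word, 1));
    }
    return output;
  }

  // Shuffle and sort step: group the values by key, with keys kept in sorted order
  static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
    Map<String, List<Integer>> grouped = new TreeMap<>();
    for (Map.Entry<String, Integer> pair : pairs) {
      grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
    }
    return grouped;
  }

  // Reduce step: add up the list of values for one key
  static int reduce(String key, List<Integer> values) {
    int sum = 0;
    for (int value : values) {
      sum += value;
    }
    return sum;
  }

  // Runs one simulated map task per line, then shuffles and reduces
  static Map<String, Integer> run(List<String> lines) {
    List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
    for (String line : lines) {
      mapOutput.addAll(map(line));
    }
    Map<String, Integer> result = new TreeMap<>();
    for (Map.Entry<String, List<Integer>> group : shuffle(mapOutput).entrySet()) {
      result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Integer> counts =
        run(Arrays.asList("This is a test file", "This is a Hadoop MapReduce program file"));
    for (Map.Entry<String, Integer> entry : counts.entrySet()) {
      System.out.println(entry.getKey() + " " + entry.getValue());
    }
  }
}
```

Running it prints the same eight word counts as listed above, in the same sorted order.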


That's all for the topic How MapReduce Works in Hadoop. If something is missing or you have something to share about the topic please write a comment.



Hadoop MapReduce Word Count Program

Once you have installed Hadoop on your system and done the initial verification, you will want to write your first MapReduce program. Before digging deeper into the intricacies of MapReduce programming, the first step is the word count MapReduce program in Hadoop, which is also known as the "Hello World" of the Hadoop framework.

So here is a simple Hadoop MapReduce word count program written in Java to get you started with MapReduce programming.

What you need

  1. It will be good if you have any IDE like Eclipse to write the Java code.
  2. A text file which is your input file. It should be copied to HDFS. This is the file which Map task will process and produce output in (key, value) pairs. This Map task output becomes input for the Reduce task.

Process

These are the steps you need for executing your Word count MapReduce program in Hadoop.

  1. Start daemons by executing the start-dfs and start-yarn scripts.
  2. Create an input directory in HDFS where you will keep your text file.
    bin/hdfs dfs -mkdir /user
    
    bin/hdfs dfs -mkdir /user/input
    
  3. Copy the text file you created to /user/input directory.
    bin/hdfs dfs -put /home/knpcode/Documents/knpcode/Hadoop/count /user/input
    

    I have created a text file called count with the following content

    This is a test file.
    This is a test file.
    

    If you want to verify that the file has been copied, you can run the following command-

    bin/hdfs dfs -ls /user/input
    
    Found 1 items
    -rw-r--r--   1 knpcode supergroup         42 2017-12-22 18:12 /user/input/count
    

Word count MapReduce Java code

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  // Map function
  public static class WordMapper extends Mapper<LongWritable, Text, Text, IntWritable>{
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    public void map(LongWritable key, Text value, Context context) 
        throws IOException, InterruptedException {
      // Splitting the line on spaces
      String[] stringArr = value.toString().split("\\s+");
      for (String str : stringArr) {
        word.set(str);
        context.write(word, one);
      }       
    }
  }
	
  // Reduce function
  public static class CountReducer extends Reducer<Text, IntWritable, Text, IntWritable>{		   
    private IntWritable result = new IntWritable();
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }
	
  public static void main(String[] args) throws Exception{
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(WordMapper.class);    
    job.setReducerClass(CountReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

You will need at least the following jars to compile your MapReduce code; you will find them in the share directory of your Hadoop installation.

[Image: Word count MapReduce program jars]

Running the word count MapReduce program

Once your code is successfully compiled, create a jar. If you are using the Eclipse IDE you can use it to create the jar by right-clicking on the project, then Export, then Java (JAR file).

Once jar is created you need to run the following command to execute your MapReduce code.

bin/hadoop jar /home/knpcode/Documents/knpcode/Hadoop/wordcount.jar org.knpcode.WordCount /user/input /user/output

In the above command

/home/knpcode/Documents/knpcode/Hadoop/wordcount.jar is the path to your jar.

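org.knpcode.WordCount is the fully qualified name of the Java class that you need to run. This assumes the WordCount class is declared in the org.knpcode package; the listing above omits the package statement.

/user/input is the path to the input directory in HDFS.

/user/output is the path to the output directory.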

In the java program in the main method there were these two lines-

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

That’s where input and output directories will be set.

To see an explanation of word count MapReduce program working in detail, check this post- How MapReduce Works in Hadoop

After execution you can check the output directory for the output.

bin/hdfs dfs -ls /user/output

Found 2 items
-rw-r--r--   1 knpcode supergroup          0 2017-12-22 18:15 /user/output/_SUCCESS
-rw-r--r--   1 knpcode supergroup         31 2017-12-22 18:15 /user/output/part-r-00000

The output can be verified by listing the content of the created output file.

bin/hdfs dfs -cat /user/output/part-r-00000
This	2
a	2
file.	2
is	2
test	2


That's all for the topic Hadoop MapReduce Word Count Program. If something is missing or you have something to share about the topic please write a comment.



What is Big Data

Big Data means a very large volume of data. The term big data is used to describe data so huge and ever-growing that it has gone beyond the storage and processing capabilities of traditional data management and processing tools.

Some Examples

What to do with Big Data

Giving such examples of having petabytes of data is fantastic, but the question is what to do with that kind of data. Big Data is not just about generating huge volumes of data. One aspect of Big Data is coming up with technologies to store such huge data, but another, more important aspect is being able to analyze that data and use it to make business decisions faster and more accurately, and to better understand consumer behavior.

Data in Big Data

Data in Big Data can be of any type (structured, semi-structured or unstructured), such as text, video, audio, sensor data, log files etc.

  1. Structured data– Any data that is organized in a fixed format can be termed structured data, such as data stored in relational databases or in spreadsheets. For creating structured data you will have predefined rules on what type of data will be stored and how that data will be stored.
  2. Semi-structured data– Any data that doesn’t conform to the rigid structure associated with structured data but still has some structure, such as tags or other markers that separate and identify different elements and give hierarchies of records and fields within the data, can be termed semi-structured data. Examples are XML and JSON.
  3. Unstructured data– As the name suggests, unstructured data is the exact opposite of structured data, which means it doesn’t conform to any predefined rules in terms of type of data and field positions within a file or record. Unstructured data usually includes multiple types of data, where you may have a combination of text, videos and images in no defined arrangement. Examples of unstructured data are books, web pages, email messages etc. Because it does not fit any defined format, unstructured data is very difficult to analyze.

3 Vs of Big Data

Big Data can be described by following characteristics-

  • Volume– This characteristic refers to the volume of data that is generated and stored. It’s the size of data that determines the potential insight that can be derived from it and even determines whether the data can actually be considered as big data or not.
  • Velocity– This characteristic refers to the speed at which data is generated and processed. Examples are processing the trade data created each day in a stock exchange to identify potential fraud, and analyzing the click stream data of a consumer in real time to offer the consumer suitable alternatives or products.
  • Variety- This characteristic refers to the type and nature of the data. Data may be structured, unstructured or semi-structured. Analyzing all these types of data together provides better insights.

These 3 Vs have since been expanded, and are now even termed the 5 Vs, to add new characteristics to Big Data.

  • Variability– This characteristic refers to the inconsistency of the data flow. There may be peak times when the data flow is so large that it renders the processes in place for handling and managing data ineffective.
  • Veracity- This characteristic refers to the quality of data collected from multiple sources.

Some Big Data technologies

Some of the Big data technologies for storing and analyzing big data are-

  • Apache Hadoop– Over the years Hadoop has grown into a whole ecosystem of related technologies such as HDFS, Hive, Pig and even Apache Spark.
  • NoSQL Databases- For storing unstructured data and providing very fast performance. Some of the NoSQL databases are MongoDB, Cassandra and HBase.
  • Presto– Developed by Facebook, Presto is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes.


That's all for the topic What is Big Data. If something is missing or you have something to share about the topic please write a comment.



How to Improve Map-Reduce Performance

In this post we’ll see some of the ways to improve performance of the Map-Reduce job in Hadoop.

The tips given here for improving the performance of MapReduce job are more from the MapReduce code and configuration perspective rather than cluster and hardware perspective.

1- Enabling uber mode– Unlike Hadoop 1, there is no JVM reuse feature in YARN Hadoop, but you can enable a job to run in uber mode; by default uber mode is not enabled. If uber mode is enabled and the ApplicationMaster calculates that the overhead of negotiating resources with the ResourceManager, communicating with NodeManagers on different nodes to launch the containers and running the tasks in those containers is much more than running the MapReduce job sequentially in the same JVM, it can run the job as an uber task.

2- For compression try to use native library- When using compression and decompression in Hadoop it is better to use the native library, as the native library will generally outperform a codec written in a programming language like Java.

3- Increasing the block size- If the input file is very large you can consider increasing the HDFS block size, for example to 512 MB. That can be done by setting the parameter dfs.blocksize. If you set dfs.blocksize to a higher value, the input split size increases to the same size because the input split size is calculated using the formula-

Math.max(mapreduce.input.fileinputformat.split.minsize, Math.min(mapreduce.input.fileinputformat.split.maxsize, dfs.blocksize))

thus making it the same size as the HDFS block size, as long as the minimum and maximum split sizes don't override it. By increasing the block size you will also have less overhead in terms of metadata as there will be fewer blocks.

If the input split is larger, Map tasks get more data to process. In Hadoop as many map tasks are started as there are input splits, so having fewer input splits means the overhead of initializing map tasks is reduced.
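
The split size formula is easy to try out in plain Java. The following is a minimal sketch, not Hadoop code; the class SplitSizeSketch is hypothetical, the minimum and maximum split sizes used (1 byte and Long.MAX_VALUE) are assumed values, and the split count is only an approximation that ignores any special handling of the last split.

```java
class SplitSizeSketch {
  // Same shape as the formula: max(minSize, min(maxSize, blockSize))
  static long computeSplitSize(long minSize, long maxSize, long blockSize) {
    return Math.max(minSize, Math.min(maxSize, blockSize));
  }

  // Approximate number of splits (and so map tasks) for a file, rounded up
  static long numberOfSplits(long fileSize, long splitSize) {
    return (fileSize + splitSize - 1) / splitSize;
  }

  public static void main(String[] args) {
    long mb = 1024L * 1024L;
    long minSize = 1L;              // assumed minimum split size
    long maxSize = Long.MAX_VALUE;  // assumed maximum split size
    long fileSize = 10L * 1024L * mb; // a 10 GB input file, for illustration

    long small = computeSplitSize(minSize, maxSize, 128L * mb);
    long large = computeSplitSize(minSize, maxSize, 512L * mb);
    System.out.println("128 MB blocks- " + numberOfSplits(fileSize, small) + " splits");
    System.out.println("512 MB blocks- " + numberOfSplits(fileSize, large) + " splits");
  }
}
```

For the 10 GB file this gives 80 splits with 128 MB blocks and 20 splits with 512 MB blocks, which is the reduction in map tasks described above.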

4- Time taken by map tasks- A map task should run for at least a minute (1-3 minutes). If it finishes within less than a minute, that means the input data to the map task is too small. If there are many small files in your MapReduce job then try to use a container file format like Sequence file or Avro to hold those small files.

You can also use CombineFileInputFormat, which puts many files into an input split so that there is more data for the mapper to process.

5- Input data compression is splittable or not- If the input data is compressed, whether the compression format used is splittable is also one of the things to consider. If the input data is not splittable there will be only a single split processed by a single map task, making the processing very slow with no parallelism at all.

For compressing input data use bzip2, which is splittable, or use LZO with indexing to make it splittable.

6- Setting number of reduce tasks- The number of maps is usually driven by the number of input splits, but the number of reducers can be controlled. As per the documentation, the right number of reduces seems to be 0.95 or 1.75 multiplied by (<no. of nodes> * <no. of maximum containers per node>).

With 0.95 all of the reduces can launch immediately and start transferring map outputs as the maps finish. With 1.75 the faster nodes will finish their first round of reduces and launch a second wave of reduces doing a much better job of load balancing.

Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures.

The number of reduces for the job is set by the user via Job.setNumReduceTasks(int).
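
As a worked example of the rule of thumb, the following minimal sketch computes both suggested values in plain Java. The class ReducerCountSketch is hypothetical and the cluster size (10 nodes with 8 containers each) is an assumed value for illustration.

```java
class ReducerCountSketch {
  // factor is 0.95 or 1.75 as per the rule of thumb
  static int suggestedReducers(double factor, int nodes, int maxContainersPerNode) {
    return (int) Math.round(factor * nodes * maxContainersPerNode);
  }

  public static void main(String[] args) {
    int nodes = 10;               // assumed cluster size
    int maxContainersPerNode = 8; // assumed containers per node

    // Single wave: all reduces can start as soon as maps finish
    System.out.println("0.95 factor- " + suggestedReducers(0.95, nodes, maxContainersPerNode));
    // Two waves: faster nodes pick up a second round of reduces
    System.out.println("1.75 factor- " + suggestedReducers(1.75, nodes, maxContainersPerNode));
    // The chosen value would then be passed to Job.setNumReduceTasks(int)
  }
}
```

For this hypothetical cluster the two factors give 76 and 140 reduce tasks respectively.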

7- Data skew at reducer side- If the data is skewed in such a way that many more values are grouped under a few keys rather than being evenly distributed, then the reduce tasks that process the keys with more values will take more time to finish, whereas the other reducers will get less data because of the uneven distribution and finish early.

In this type of scenario try to analyze how the data is partitioned and look at the possibility of writing a custom partitioner so that the data is distributed evenly among the reducers.

8- Shuffle phase performance improvements- The shuffle phase in the Hadoop framework is very network intensive as files are transferred from mappers to reducers. There is a lot of IO involved as map output is written to local disk, and there is also a lot of processing in the form of partitioning the data as per reducers, sorting data by keys and merging.

Optimization for reducing the shuffle phase time helps in reducing the overall job time. Some of the performance improvement tips are as follows-

  • Compressing the map output- Since Map output is written to disk and also transferred to the reducer, compressing map output saves storage space, makes it faster to write to disk and reduces data that has to be transferred to reducer node.
  • Filtering data- See how you can cut down on data emitted by Map tasks. Filter the records to remove unwanted records entirely. Also, reduce the record size by taking only the relevant record fields.
  • Using Combiner- Using combiner in MapReduce is a good way to improve performance of the overall MapReduce job. By using combiner you can aggregate data in the map phase itself and reduce the number of records sent to the reducer.
  • Raw Comparator- During sorting and merging the Hadoop framework uses a comparator to compare keys. If you are using a custom comparator then try to write it as a raw comparator so that the comparison can be done at the byte level itself. Otherwise the keys have to be deserialized into objects before they can be compared, which makes the process time consuming.
  • Setting parameters with optimum values- Another action you can take to improve performance of the MapReduce job is to change values of some of the configuration parameters.

    Your goal is to reduce the records spilled to disk at map as well as reduce side. At map side you can change the setting for the following parameters to try to reduce the number of spills to disk.

    • mapreduce.task.io.sort.mb- The total amount of buffer memory to use while sorting files, in megabytes.
    • mapreduce.map.sort.spill.percent- The soft limit in the serialization buffer. Once reached, a thread will begin to spill the contents to disk in the background.

    At reduce side you can change the setting for the following parameters to try to keep data in memory itself.

    • mapreduce.reduce.shuffle.input.buffer.percent- The percentage of memory to be allocated from the maximum heap size to storing map outputs during the shuffle.
    • mapreduce.reduce.input.buffer.percent- The percentage of memory- relative to the maximum heap size- to retain map outputs during the reduce.
    • mapreduce.reduce.shuffle.memory.limit.percent- Maximum percentage of the in-memory limit that a single shuffle can consume.
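
To see why a combiner, one of the tips listed above, cuts down the data that is shuffled, the following minimal sketch imitates it in plain Java. This is not Hadoop code: the class CombinerSketch is hypothetical, and it simply sums the (word, 1) records of one map task locally before they would be sent to a reducer.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CombinerSketch {
  // Sums the counts per word, the way a combiner aggregates map output locally
  static Map<String, Integer> combine(List<String> emittedWords) {
    Map<String, Integer> combined = new LinkedHashMap<>();
    for (String word : emittedWords) {
      combined.merge(word, 1, Integer::sum);
    }
    return combined;
  }

  public static void main(String[] args) {
    // Words emitted by a single map task for its split
    List<String> emitted =
        Arrays.asList("This is a test file. This is a test file.".split("\\s+"));
    Map<String, Integer> combined = combine(emitted);
    System.out.println("Records without combiner- " + emitted.size());
    System.out.println("Records with combiner- " + combined.size());
    System.out.println(combined);
  }
}
```

Here ten map output records shrink to five before the shuffle; in a real job the saving depends on how often keys repeat within a single map task's output.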

9- Improvements in MapReduce coding- You should also optimize your MapReduce code so that it runs efficiently.

  • Reusing objects- Since the map method is called many times, creating new objects judiciously will help you reduce the overhead associated with object creation. Try to reuse objects as much as you can. One very frequent mistake is writing code as follows, where a new Text object is created on every call.
    String[] stringArr = value.toString().split("\\s+");
    Text outValue = new Text(stringArr[0]); // new object on every map() call
    context.write(key, outValue);
    

    You should write it as follows-

    // Created once per mapper and reused for every record
    private Text outValue = new Text();
    public void map(LongWritable key, Text value, Context context) 
        throws IOException, InterruptedException {
      String[] stringArr = value.toString().split("\\s+");
      outValue.set(stringArr[0]);// reusing object
      context.write(key, outValue);
    }
    
  • String concatenation- Since String in Java is immutable, String concatenation results in new String objects being created. For appending prefer StringBuilder (or StringBuffer when thread safety is needed) instead.
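
The string concatenation tip can be illustrated with a minimal plain Java sketch; the class ConcatSketch and its method names are chosen for illustration. Both methods build the same text, but the first creates a new String on every loop iteration while the second appends to a single buffer.

```java
class ConcatSketch {
  // Creates a new String object on every iteration
  static String joinWithConcatenation(String[] parts) {
    String result = "";
    for (String part : parts) {
      result += part + ",";
    }
    return result;
  }

  // Appends to one StringBuilder and creates the String once at the end
  static String joinWithBuilder(String[] parts) {
    StringBuilder builder = new StringBuilder();
    for (String part : parts) {
      builder.append(part).append(',');
    }
    return builder.toString();
  }

  public static void main(String[] args) {
    String[] parts = "This is a test file".split("\\s+");
    System.out.println(joinWithConcatenation(parts));
    System.out.println(joinWithBuilder(parts));
  }
}
```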

That's all for the topic How to Improve Map-Reduce Performance. If something is missing or you have something to share about the topic please write a comment.

