Remove duplicate lines from file using awk

We often need to remove duplicate lines from a file; support guys know the pain well 🙂

Let's look at the quickest way to do this.

Requirement:

Given a text file that contains repeated lines, print each distinct line only once, keeping the first occurrence and the original line order.

Solution:

awk '!x[$0]++' file1.txt
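
As a quick check, here is a minimal sketch that runs the one-liner on a small sample. The temporary file and its contents are made up for illustration and are not from the original requirement.

```shell
# Hypothetical sample input; the contents are illustrative only.
tmpfile=$(mktemp)
printf '%s\n' apple banana apple cherry banana > "$tmpfile"

# Print each distinct line once, keeping the first occurrence in order.
result=$(awk '!x[$0]++' "$tmpfile")
printf '%s\n' "$result"
# prints:
# apple
# banana
# cherry

rm -f "$tmpfile"
```

To keep the result, redirect the output to a different file, for example `awk '!x[$0]++' file1.txt > deduped.txt` (the name deduped.txt is just a placeholder). Redirecting to the input file itself would truncate it before awk reads it.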

Explanation:

  • x[$0]: look up the key $0 (the whole current line) in the associative array x. If the key does not exist, awk creates it with an empty value.
  • x[$0]++: increment the value of x[$0] and return the old value as the value of the expression. If x[$0] did not exist yet, the expression returns 0 and x[$0] becomes 1 (the ++ operator yields a numeric value).
  • !x[$0]++: negate the value of the expression. If x[$0]++ returns 0, the whole expression is true, so awk performs its default action, print $0. Otherwise the whole expression is false and awk does nothing for that line.
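
The three steps above can also be written out longhand. The following is only a sketch of an equivalent program, not a replacement for the one-liner; the array name seen and the sample input are assumptions chosen for readability.

```shell
# Feed a hypothetical sample on stdin and deduplicate it step by step.
expanded=$(printf '%s\n' apple banana apple cherry banana | awk '
{
    if (seen[$0] == 0) {   # the line has not been counted before
        print $0           # so print it (the default action in the one-liner)
    }
    seen[$0]++             # record one more occurrence of this line
}')
printf '%s\n' "$expanded"
# prints:
# apple
# banana
# cherry
```

Because every distinct line is stored as an array key, memory use grows with the number of unique lines in the file.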


Vikram Pawar


Coder by passion..... Thirsty for knowledge, challenges & most importantly innovation. Working with BNP Paribas
