Remove only certain duplicate lines with awk

bgstack15

2017-04-11 09:33

Basic solution

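The thread at http://www.unix.com/shell-programming-and-scripting/153131-remove-duplicate-lines-using-awk.html demonstrates and explains how to use awk to remove duplicate lines from a stream without having to sort them first. This statement is really useful:

    awk '!x[$0]++'

It works because x[$0] is empty (false) the first time a line is seen, so the negation is true and awk prints the line. The post-increment then marks the line as seen, so every later copy is skipped.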
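As a quick way to see the statement at work, here is a minimal sketch run against a few made-up lines. The sample input and the variable name deduped are assumptions for illustration only; they are not from the linked thread.

```shell
# Sample input is invented for illustration: "a" and "b" each appear twice.
deduped=$(printf 'a\nb\na\nc\nb\n' | awk '!x[$0]++')

# Only the first copy of each line survives, in the original order.
printf '%s\n' "$deduped"
# prints:
# a
# b
# c
```

Unlike sort -u, this keeps the lines in the order they first appeared, which matters for ordered input such as firewall rules.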

The fancy solution

But if you need certain duplicated lines preserved, such as the COMMIT statements in the output of iptables-save, you can use this one-liner:

    iptables-save | awk '!asdf[$0]++; /COMMIT|Completed on|Generated by/;' | uniq

The second awk rule prints again any line that matches "COMMIT", "Completed on", or "Generated by", which appear multiple times in the iptables-save output. The first occurrence of such a line is printed by both rules, so the trailing uniq collapses that adjacent pair back into a single line.

I was programmatically adding rules, and one host in particular kept adding new ones even though an identical rule already existed. So I had to remove the duplicates and save the output, but keep all the duplicate "COMMIT" statements. I also wanted to keep all the comments.
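To try the one-liner without touching a real firewall, the sketch below feeds it a small hand-written stand-in for iptables-save output. The sample_rules function, its contents, and the variable names are hypothetical. The second pipeline is an assumed alternative, not the method from this post: it folds both conditions into one awk pattern so that uniq is not needed.

```shell
# Hypothetical stand-in for iptables-save output, so no firewall is needed.
sample_rules() {
cat <<'EOF'
# Generated by iptables-save (sample)
*nat
-A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080
-A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080
COMMIT
# Completed on (sample)
# Generated by iptables-save (sample)
*filter
-A INPUT -p tcp --dport 22 -j ACCEPT
-A INPUT -p tcp --dport 22 -j ACCEPT
COMMIT
# Completed on (sample)
EOF
}

# The one-liner from the post: dedupe, re-print the protected lines,
# then let uniq collapse the first occurrence that was printed twice.
with_uniq=$(sample_rules | awk '!asdf[$0]++; /COMMIT|Completed on|Generated by/;' | uniq)

# Assumed alternative: always print protected lines, and print any other
# line only the first time it is seen. No uniq required.
single_awk=$(sample_rules | awk '/COMMIT|Completed on|Generated by/ || !seen[$0]++')

printf '%s\n' "$with_uniq"
```

On this sample both pipelines drop the repeated -A rules and keep both COMMIT lines and all four comment lines. They can differ on input where two identical protected lines sit directly next to each other, since uniq would merge those and the single-awk form would not.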
