Understanding The diff Command In Unix

April 17th, 2009

Unix system administrators often need to know how two files differ, and the diff command in Unix serves that purpose. It is a common but little-understood command. I hope that after reading this article you will be able to read its output properly and benefit from it. Another useful command is comm. Here you go…

The example files are named first and second. Their contents are listed below:

wiw_labs:$ nl first
1 computer
2 modem
3 monitor
4 phone
5 switch

wiw_labs:$ nl second
1 cable
2 mobile
3 screen
4 modem
5 phone
6 server

The diff command reports how these two files differ, line by line.

How diff Command Works
Let’s start with the usage of the diff command. Its general form is:
diff first_file second_file

So, you can read the command as:
How first_file is different from second_file.
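
Here is a minimal sketch of the general form, assuming a POSIX shell and a scratch directory created with mktemp (the variable names tmp, status_differ and status_same are my own, not part of diff). It recreates the two example files and shows that diff also reports its verdict through the exit status: 0 when the files are identical, 1 when they differ, and 2 when something went wrong.

```shell
# Scratch directory so the example does not touch existing files (assumption).
tmp=$(mktemp -d)

# Recreate the two example files, one word per line.
printf '%s\n' computer modem monitor phone switch > "$tmp/first"
printf '%s\n' cable mobile screen modem phone server > "$tmp/second"

# Compare the two files; the listing is discarded, only the status is kept.
# The "|| status=$?" form also keeps the script alive under "set -e".
status_differ=0
diff "$tmp/first" "$tmp/second" > /dev/null || status_differ=$?

# Compare a file with itself.
status_same=0
diff "$tmp/first" "$tmp/first" > /dev/null || status_same=$?

echo "first vs second: $status_differ"
echo "first vs first:  $status_same"
```

The exit status is handy in scripts, where you usually care only whether the files differ and not how.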

Philosophy of diff Command
The diff command works on the philosophy of changing the first file so that it becomes identical to the second file. Its output tells you which lines of the first file must be changed (c) or deleted (d), and which lines of the second file must be appended (a) to the first file. If that is not clear yet, don’t worry. It will become clear once I explain it with an example.
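
This philosophy can be seen quite literally with the -e option, which asks diff to print its result as a script for the ed editor instead of the normal listing. The sketch below assumes a POSIX shell and a diff that supports -e. The scratch directory and the variable names tmp and script are my own.

```shell
# Scratch directory and the two example files (same contents as above).
tmp=$(mktemp -d)
printf '%s\n' computer modem monitor phone switch > "$tmp/first"
printf '%s\n' cable mobile screen modem phone server > "$tmp/second"

# -e prints the differences as ed commands that turn first into second.
# diff exits with status 1 when the files differ, so "|| true" keeps
# the script running under "set -e".
script=$(diff -e "$tmp/first" "$tmp/second" || true)

printf '%s\n' "$script"
```

The script contains the commands 5c, 3d and 1c, in that order. They are listed from the bottom of the file upwards so that an earlier edit never shifts the line numbers used by a later one.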

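Here is a simplified picture of the steps diff follows to produce the differences between the files:

  1. It starts with the first line of the first file and of the second file. If these match, it moves on; otherwise it keeps traveling down the first file until it finds a line that also appears in the second file.
  2. If the first line of the second file is not found in the first file, it takes the second line of the second file and searches the first file for that, and so on. Once it has matching lines to anchor on, it suggests what to do with the lines in between (append, change or delete). Real implementations such as GNU diff are more refined and look for the longest sequence of lines that both files share in the same order, but this picture is enough to read the output.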
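
To see which lines diff can anchor on in our example, the sketch below lists the lines of first that also appear in second. This is only an illustration with grep, not the way diff works internally, and it ignores line order and repeated lines. The scratch directory and the variable names tmp and shared are my own.

```shell
# Scratch directory and the two example files (same contents as above).
tmp=$(mktemp -d)
printf '%s\n' computer modem monitor phone switch > "$tmp/first"
printf '%s\n' cable mobile screen modem phone server > "$tmp/second"

# -F: treat patterns as fixed strings, -x: match whole lines only,
# -f: read the patterns (one per line) from the second file.
shared=$(grep -Fxf "$tmp/second" "$tmp/first")

printf '%s\n' "$shared"
```

For the example files this prints modem and phone. Everything before, between and after these two lines is what diff has to change, delete or append.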

Enough theory. Let’s take a practical example to make it clear.
I have pasted the files side by side to make them easier to compare, and the line numbers are printed as well.

wiw_labs:$ paste first second|nl
1 computer cable
2 modem mobile
3 monitor screen
4 phone modem
5 switch phone
6 server

wiw_labs:$ diff first second
1c1,3
< computer
---
> cable
> mobile
> screen
3d4
< monitor
5c6
< switch
---
> server

Now, take a look at the numbered output of the paste command above. The things to note are:

  1. The second line (modem) of the first file matches the fourth line (modem) of the second file. So, if we replace the first line of the first file with the first three lines of the second file, the first part of both files becomes the same. The output will then look like this:

    wiw_labs:$ paste first second|nl
    1 cable cable
    2 mobile mobile
    3 screen screen
    4 modem modem
    5 monitor phone
    6 phone server
    7 switch

  2. The fourth line (phone) of the original first file matches the fifth line (phone) of the second file. That means if we delete the third line (monitor) of the original first file, which is the fifth line at present, the second part of the files will match.

    wiw_labs:$ paste first second|nl
    1 cable cable
    2 mobile mobile
    3 screen screen
    4 modem modem
    5 phone phone
    6 switch server

  3. The fifth line (switch) of the original first file can be replaced with the 6th line (server) of the second file. After that, both files match fully.

wiw_labs:$ paste first second|nl
1 cable cable
2 mobile mobile
3 screen screen
4 modem modem
5 phone phone
6 server server

Now it’s easier to understand the output of the diff command.
1c1,3: Change the first line of the first file to lines 1 to 3 of the second file.
3d4: Delete line 3 (monitor) from the first file. The 4 is the line of the second file after which the deleted line would have appeared.
5c6: Change the 5th line (switch) of the first file to the 6th line (server) of the second file.
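
As a cross-check, the three instructions can be carried out by hand. The sketch below translates each one into a sed command and confirms with cmp that the result is identical to second. It assumes a POSIX shell and sed. The scratch directory, the output file first.new and the variable names tmp and result are my own. sed addresses refer to input line numbers, which matches diff's habit of always quoting the original line numbers.

```shell
# Scratch directory and the two example files (same contents as above).
tmp=$(mktemp -d)
printf '%s\n' computer modem monitor phone switch > "$tmp/first"
printf '%s\n' cable mobile screen modem phone server > "$tmp/second"

# 1c1,3 -> change line 1 to: cable, mobile, screen
# 3d4   -> delete line 3 (monitor)
# 5c6   -> change line 5 to: server
sed -e '1c\
cable\
mobile\
screen' -e '3d' -e '5c\
server' "$tmp/first" > "$tmp/first.new"

# cmp -s is silent and exits with status 0 when the files are identical.
if cmp -s "$tmp/first.new" "$tmp/second"; then
    result=identical
else
    result=different
fi
echo "$result"
```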

Now, take the reverse case:

wiw_labs:$ paste second first | nl
1 cable computer
2 mobile modem
3 screen monitor
4 modem phone
5 phone switch
6 server

wiw_labs:$ diff second first
1,3c1
< cable
< mobile
< screen
---
> computer
4a3
> monitor
6c5
< server
---
> switch

  1. In this run the file named second is given first, so "first file" below means second and "second file" means first. The 4th line (modem) of the first file matches the 2nd line of the second file. So, if we replace lines 1 through 3 of the first file with the 1st line of the second file, we get the following output:

    wiw_labs:$ paste second first | nl
    1 computer computer
    2 modem modem
    3 phone monitor
    4 server phone
    5 switch

  2. Now, the 3rd line (monitor) of the second file does not exist in the first file. So, append it after the 4th line (modem) of the first file. Do remember that the line numbers in the output of diff are always the original line numbers. The output will then look like this:

    wiw_labs:$ paste second first | nl
    1 computer computer
    2 modem modem
    3 monitor monitor
    4 phone phone
    5 server switch

  3. The last line of the first file, the 6th line (server), now needs to be changed to the last line of the second file, the 5th line (switch). After doing so, the first file is the same as the second file.

1 computer computer
2 modem modem
3 monitor monitor
4 phone phone
5 switch switch

Now it’s easier to understand the output of the diff command.
1,3c1: Change lines 1 through 3 of the first file to line 1 of the second file.
4a3: Append line 3 (monitor) of the second file after the 4th line (modem) of the first file.
6c5: Change the 6th line (server) of the first file to the 5th line (switch) of the second file.
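
The reverse case can be cross-checked the same way. The sketch below turns each instruction into a sed command, applies them to second, and confirms with cmp that the result is identical to first. It assumes a POSIX shell and sed. The scratch directory, the output file second.new and the variable names tmp and result are my own.

```shell
# Scratch directory and the two example files (same contents as above).
tmp=$(mktemp -d)
printf '%s\n' computer modem monitor phone switch > "$tmp/first"
printf '%s\n' cable mobile screen modem phone server > "$tmp/second"

# 1,3c1 -> change lines 1 through 3 to: computer
# 4a3   -> append monitor after line 4 (modem)
# 6c5   -> change line 6 to: switch
sed -e '1,3c\
computer' -e '4a\
monitor' -e '6c\
switch' "$tmp/second" > "$tmp/second.new"

# cmp -s is silent and exits with status 0 when the files are identical.
if cmp -s "$tmp/second.new" "$tmp/first"; then
    result=identical
else
    result=different
fi
echo "$result"
```

Note that the append uses line 4, the original position of modem in second, even though modem has moved to line 2 by the time the first change is done.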
