Celeb Glow
updates | March 20, 2026

awk - compare two files and print all columns from both files

I want to compare two files

File 1:

evm.TU.PTPU-T1. PF00808
evm.TU.PTP-T1 PF00498
evm.TU.PTPX-T1 PF00250
evm.TU.PAN-T1 PF00817

File 2:

PF00808 CL0012 Histone CBFD_NFYB_HMF Histone-like transcription factor
PF00498 CL0357 SMAD-FHA FHA FHA domain
PF00817 CL0123 HTH Forkhead Forkhead domain

Output:

evm.TU.PTPU-T1 PF00808 CL0012 Histone CBFD_NFYB_HMF Histone-like
evm.TU.PTP-T1 PF00498 CL0357 SMAD-FHA FHA FHA domain
evm.TU.PAN-T1 PF00817 CL0123 HTH Forkhead Forkhead domain

I tried the command below:

awk 'FNR==NR{a[$1]=$2;next} ($1 in a){print $0,a[$1]}' file2 file1 >file3

but it prints only the second column of file 2, not the entire line:

PF00808 evm.TU.PTPU-T1 CL0012

Please let me know how to add the entire matched line of file 2 to the output, not just the second column.


1 Answer

You have a couple of options here:

  1. save the whole lines ($0) of File2 in an array keyed on $1, then for each line of File1 look up its $2 in that array and print $1 followed by the stored line:

     $ awk 'NR==FNR{a[$1]=$0; next} ($2 in a){print $1,a[$2]}' File2 File1
     evm.TU.PTPU-T1. PF00808 CL0012 Histone CBFD_NFYB_HMF Histone-like transcription factor
     evm.TU.PTP-T1 PF00498 CL0357 SMAD-FHA FHA FHA domain
     evm.TU.PAN-T1 PF00817 CL0123 HTH Forkhead Forkhead domain
  2. save the $1 values of File1 in an array keyed on $2, then for each line of File2 look up its $1 in that array and print the stored value followed by the whole line:

     $ awk 'NR==FNR{a[$2]=$1; next} ($1 in a){print a[$1], $0}' File1 File2
     evm.TU.PTPU-T1. PF00808 CL0012 Histone CBFD_NFYB_HMF Histone-like transcription factor
     evm.TU.PTP-T1 PF00498 CL0357 SMAD-FHA FHA FHA domain
     evm.TU.PAN-T1 PF00817 CL0123 HTH Forkhead Forkhead domain
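
To try both options end to end, the sketch below recreates the sample data from the question and runs each command. It is only an illustration: it writes File1 and File2 into the current directory (so run it in an empty scratch directory), and the output names out1 and out2 are made up for this example. The trailing period on the first ID is kept exactly as it appears in the question's File 1. The original attempt stored only `$2` of file2 in the array (`a[$1]=$2`), which is why only one extra column could ever be printed; storing `$0` instead keeps the whole line.

```shell
#!/bin/sh
# Recreate the sample inputs from the question in the current directory.
cat > File1 <<'EOF'
evm.TU.PTPU-T1. PF00808
evm.TU.PTP-T1 PF00498
evm.TU.PTPX-T1 PF00250
evm.TU.PAN-T1 PF00817
EOF

cat > File2 <<'EOF'
PF00808 CL0012 Histone CBFD_NFYB_HMF Histone-like transcription factor
PF00498 CL0357 SMAD-FHA FHA FHA domain
PF00817 CL0123 HTH Forkhead Forkhead domain
EOF

# Option 1: index File2 by its first field, then stream File1.
# Output follows the line order of File1.
awk 'NR==FNR{a[$1]=$0; next} ($2 in a){print $1, a[$2]}' File2 File1 > out1

# Option 2: index File1 by its second field, then stream File2.
# Output follows the line order of File2.
awk 'NR==FNR{a[$2]=$1; next} ($1 in a){print a[$1], $0}' File1 File2 > out2

cat out1
```

With this sample data both commands should produce the same three lines, and the PF00250 row is dropped because it has no match in File2. If the two files list their keys in different orders, the outputs will differ in line order only.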