How to Merge Multiple CSV Files in Linux Mint
In this short tutorial, I'll show you several ways to merge multiple files with bash in Linux Mint. The examples will cover merging with and without the headers.
The examples assume that all CSV files have the same columns. If they don't, see the Python solution: How to merge multiple CSV files with Python
Suppose we have two CSV files:
- file1.csv
col1 | col2 | col3 |
---|---|---|
1 | Jake | 100 |
2 | John | 95 |
3 | Joe | 87 |
- file2.csv
col1 | col2 | col3 |
---|---|---|
21 | Ema | 96 |
22 | Eli | 88 |
23 | Erica | 100 |
24 | Emily | 77 |
- merged.csv - the result after concatenation:
col1 | col2 | col3 |
---|---|---|
1 | Jake | 100 |
2 | John | 95 |
3 | Joe | 87 |
21 | Ema | 96 |
22 | Eli | 88 |
23 | Erica | 100 |
24 | Emily | 77 |
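To follow along, the two sample files can be recreated from the terminal. This is only a sketch: the scratch directory name csv_demo is an assumption, and both files are written with a newline after the last row.

```shell
# Create a scratch directory (the name csv_demo is arbitrary) and move into it
mkdir -p csv_demo && cd csv_demo

# printf '%s\n' writes each argument on its own line, including the last one
printf '%s\n' 'col1,col2,col3' '1,Jake,100' '2,John,95' '3,Joe,87' > file1.csv
printf '%s\n' 'col1,col2,col3' '21,Ema,96' '22,Eli,88' '23,Erica,100' '24,Emily,77' > file2.csv

# Quick check of the line counts (header included)
wc -l file1.csv file2.csv
```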
Example 1: Append multiple CSV files in bash with(out) header
The first example merges multiple CSV or text files by combining the head and tail commands in Linux. If the files don't have headers, tail alone is enough:
tail -n+1 -q *.csv >> merged.out
If the files have headers, head takes the header from one file and tail collects the data rows from all of them:
head -n 1 file1.csv > merged.out && tail -n+2 -q *.csv >> merged.out
Note 1: if the last line of a file doesn't end with a newline character, this approach joins it to the first line that follows, producing output such as:
3,Joe,87col1,col2,col3
In this case you can check the second example.
Note 2: the -q parameter stops tail from printing a file name header before the content of each file.
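Whether Note 1 applies to your files can be checked before merging. The sketch below uses two hypothetical files, with_newline.csv and no_newline.csv, created only for the demonstration. It relies on the fact that command substitution strips a trailing newline, so the output of tail -c 1 is empty when the last byte of the file is a newline.

```shell
# Work in a throwaway directory so no real CSV files are touched
workdir=$(mktemp -d) && cd "$workdir"

# Hypothetical sample files: one ends with a newline, one does not
printf 'col1,col2,col3\n1,Jake,100\n' > with_newline.csv
printf 'col1,col2,col3\n1,Jake,100'   > no_newline.csv

# $(...) strips a trailing newline, so the string is empty
# when the last byte of the file is a newline
for f in *.csv; do
  if [ -n "$(tail -c 1 "$f")" ]; then
    echo "$f: missing final newline"
  else
    echo "$f: ends with a newline"
  fi
done
```

Files reported as missing the final newline are the ones that would be glued to the next file by the commands in Example 1.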
Example 2: Merge CSV files without new line in Linux terminal
This example shows an improved version of the first one which solves the problem of missing a new line at the end of each CSV file. This is the short version:
head -n 1 file1.csv > merged.out
for f in *.csv; do tail -n +2 "$f"; printf "\n"; done >> merged.out
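and the prettified shell version is:
# take the header from the first file
head -n 1 file1.csv > merged.out
for f in *.csv; do
    # print every line except the header
    tail -n +2 "$f";
    # terminate the last row in case the file has no final newline
    printf "\n";
done >> merged.out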
result:
col1,col2,col3
1,Jake,100
2,John,95
3,Joe,87
21,Ema,96
22,Eli,88
23,Erica,100
24,Emily,77
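The loop above prints a newline after every file, so a file that already ends with a newline leaves a blank line in the merged output. One possible variation, sketched here with hypothetical sample files, adds the newline only when the last byte of the file is not one already.

```shell
# Work in a throwaway directory with hypothetical sample files
workdir=$(mktemp -d) && cd "$workdir"
printf 'col1,col2,col3\n1,Jake,100\n2,John,95'   > file1.csv   # no final newline
printf 'col1,col2,col3\n21,Ema,96\n22,Eli,88\n'  > file2.csv   # ends with a newline

# Take the header once
head -n 1 file1.csv > merged.out

for f in *.csv; do
  # All rows except the header
  tail -n +2 "$f"
  # Add a newline only if the last byte of the file is not one already
  # ($(...) strips a trailing newline, so the test sees an empty string)
  if [ -n "$(tail -c 1 "$f")" ]; then
    printf '\n'
  fi
done >> merged.out

cat merged.out
```

With these two files the merged output contains the header followed by four data rows, with no joined lines and no blank lines.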
Example 3: Concatenate two CSV files in Linux with awk (including headers)
The third example shows how to merge two or more files with the awk command. If all files have headers, the command keeps the header of the first file only. The data rows of the remaining files are appended after it. The output is written to merged.out rather than to a .csv file, so that a later run with *.csv does not pick up the merged file as input. The command is:
awk 'FNR==1 && NR!=1{next;}{print}' *.csv >> merged.out
result:
col1,col2,col3
1,Jake,100
2,John,95
3,Joe,87
21,Ema,96
22,Eli,88
23,Erica,100
24,Emily,77
where the condition FNR==1 && NR!=1 matches the first line of every file except the first one, and next skips it. Every other line, including the header of the first file, is printed.
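The difference between the two counters is easier to see when they are printed next to each line. The following sketch uses two shortened, hypothetical sample files: NR counts lines across all input files, while FNR restarts at 1 for each file.

```shell
# Work in a throwaway directory with shortened sample files
workdir=$(mktemp -d) && cd "$workdir"
printf 'col1,col2,col3\n1,Jake,100\n' > file1.csv
printf 'col1,col2,col3\n21,Ema,96\n'  > file2.csv

# Print the file name, NR (line number across all input),
# FNR (line number within the current file) and the line itself
awk '{ print FILENAME, "NR=" NR, "FNR=" FNR, $0 }' file1.csv file2.csv
```

Only the header of file2.csv is reported with FNR=1 and an NR other than 1 (here NR=3), which is why it is the only line the merge command skips.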
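If you want to keep the header of every file, you can change the condition to one that never matches. FNR starts at 1, so FNR==0 is never true and every line is printed:
awk 'FNR==0{next;}{print}' *.csv >> merged.out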
result:
col1,col2,col3
1,Jake,100
2,John,95
3,Joe,87
col1,col2,col3
21,Ema,96
22,Eli,88
23,Erica,100
24,Emily,77