Java CSV Parser With Examples

Welcome to the Java CSV Parser tutorial. CSV files are one of the most widely used formats for passing data from one system to another. Since CSV files can be opened in Microsoft Excel, they are also easy for non-technical users to work with.

Java CSV Parser

Unfortunately, Java doesn’t have a built-in CSV parser.

If the CSV file is really simple and doesn’t contain any special characters, then we can use the Java Scanner class to parse it, but most of the time that’s not the case. Rather than writing complicated parsing logic, it’s better to use the open-source tools available for parsing and writing CSV files.
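To see why simple splitting is not enough, here is a minimal sketch that uses only the JDK. The class name NaiveSplitDemo and the method naiveSplit are made up for illustration; the sample rows come from the employees.csv file used later in this tutorial.

```java
// Hypothetical demo class, not part of any CSV library.
class NaiveSplitDemo {

    // Splits a line on every comma and ignores CSV quoting rules.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        String simple = "2,Lisa,Manager,500USD";
        String quoted = "1,Pankaj Kumar,CEO,\"5,000USD\"";

        // A simple row splits into the expected 4 fields.
        System.out.println(naiveSplit(simple).length); // prints 4
        // The quoted salary contains a comma, so the row breaks into 5 pieces.
        System.out.println(naiveSplit(quoted).length); // prints 5
    }
}
```

The second row should have four fields, but the comma inside the quoted salary is treated as a separator. Handling quotes, escaped quotes and embedded line breaks correctly is the part that the libraries below take care of.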

Three popular open-source APIs for working with CSV are:

  1. OpenCSV
  2. Apache Commons CSV
  3. Super CSV

We will look into these Java CSV parsers one by one.

Suppose we have a CSV file as:

employees.csv


ID,Name,Role,Salary
1,Pankaj Kumar,CEO,"5,000USD"
2,Lisa,Manager,500USD
3,David,,1000USD

and we want to parse it into a list of Employee objects.


package com.journaldev.parser.csv;
public class Employee {
	private String id;
	private String name;
	private String role;
	private String salary;
	public String getId() {
		return id;
	}
	public void setId(String id) {
		this.id = id;
	}
	public String getName() {
		return name;
	}
	public void setName(String name) {
		this.name = name;
	}
	public String getRole() {
		return role;
	}
	public void setRole(String role) {
		this.role = role;
	}
	public String getSalary() {
		return salary;
	}
	public void setSalary(String salary) {
		this.salary = salary;
	}
	@Override
	public String toString(){
		return "ID="+id+",Name="+name+",Role="+role+",Salary="+salary+"\n";
	}
}

1. OpenCSV

We will see how to use the OpenCSV parser to read a CSV file into Java objects and then write CSV data from Java objects. Download the OpenCSV library from the SourceForge website and include it in the classpath.

If you are using Maven, include it with the dependency below.


<dependency>
    <groupId>com.opencsv</groupId>
    <artifactId>opencsv</artifactId>
    <version>3.8</version>
</dependency>

To parse a CSV file, we can use CSVReader to read each row and convert it into an object. CSVReader also provides an option to read all the data at once and then parse it.

OpenCSV provides the CsvToBean class, which we can use with a HeaderColumnNameTranslateMappingStrategy object to automatically map the CSV data to a list of objects.
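The idea behind header-based mapping can be sketched in plain Java. This is only an illustration and is not OpenCSV code: the class HeaderMappingSketch, the cut-down Emp bean and the COLUMN_MAPPING map are all hypothetical names, and the rows are assumed to be already split into fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical sketch of header-driven mapping; this is NOT OpenCSV code.
class HeaderMappingSketch {

    // Cut-down stand-in for the Employee bean used in this tutorial.
    static class Emp {
        String id, name, role, salary;
    }

    // Maps a CSV header name to the setter that receives that column's value.
    static final Map<String, BiConsumer<Emp, String>> COLUMN_MAPPING = new LinkedHashMap<>();
    static {
        COLUMN_MAPPING.put("ID", (e, v) -> e.id = v);
        COLUMN_MAPPING.put("Name", (e, v) -> e.name = v);
        COLUMN_MAPPING.put("Role", (e, v) -> e.role = v);
        // "Salary" is left unmapped on purpose, so salary stays null.
    }

    // Builds one Emp from a header row and a data row (both already split into fields).
    static Emp toBean(String[] header, String[] row) {
        Emp emp = new Emp();
        for (int i = 0; i < header.length && i < row.length; i++) {
            BiConsumer<Emp, String> setter = COLUMN_MAPPING.get(header[i]);
            if (setter != null) {
                setter.accept(emp, row[i]); // columns without a mapping are skipped
            }
        }
        return emp;
    }

    public static void main(String[] args) {
        String[] header = {"ID", "Name", "Role", "Salary"};
        String[] row = {"2", "Lisa", "Manager", "500USD"};
        Emp emp = toBean(header, row);
        System.out.println(emp.id + " " + emp.name + " " + emp.role + " " + emp.salary);
        // prints: 2 Lisa Manager null
    }
}
```

A column whose header has no entry in the mapping is simply ignored, so the matching bean property keeps its default value of null.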

For writing CSV data, we need to create a List of String arrays and then use the CSVWriter class to write it to a file or any other Writer object.


package com.journaldev.parser.csv;
import java.io.FileReader;
import java.io.IOException;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import com.opencsv.CSVReader;
import com.opencsv.CSVWriter;
import com.opencsv.bean.CsvToBean;
import com.opencsv.bean.HeaderColumnNameTranslateMappingStrategy;
public class OpenCSVParserExample {
	public static void main(String[] args) throws IOException {
		List<Employee> emps = parseCSVFileLineByLine();
		System.out.println("**********");
		parseCSVFileAsList();
		System.out.println("**********");
		parseCSVToBeanList();
		System.out.println("**********");
		writeCSVData(emps);
	}
	private static void parseCSVToBeanList() throws IOException {
		HeaderColumnNameTranslateMappingStrategy<Employee> beanStrategy = new HeaderColumnNameTranslateMappingStrategy<Employee>();
		beanStrategy.setType(Employee.class);
		Map<String, String> columnMapping = new HashMap<String, String>();
		columnMapping.put("ID", "id");
		columnMapping.put("Name", "name");
		columnMapping.put("Role", "role");
		//columnMapping.put("Salary", "salary");
		beanStrategy.setColumnMapping(columnMapping);
		CsvToBean<Employee> csvToBean = new CsvToBean<Employee>();
		CSVReader reader = new CSVReader(new FileReader("employees.csv"));
		List<Employee> emps = csvToBean.parse(beanStrategy, reader);
		System.out.println(emps);
	}
	private static void writeCSVData(List<Employee> emps) throws IOException {
		StringWriter writer = new StringWriter();
		CSVWriter csvWriter = new CSVWriter(writer, '#');
		List<String[]> data  = toStringArray(emps);
		csvWriter.writeAll(data);
		csvWriter.close();
		System.out.println(writer);
	}
	private static List<String[]> toStringArray(List<Employee> emps) {
		List<String[]> records = new ArrayList<String[]>();
		//add header record
		records.add(new String[]{"ID","Name","Role","Salary"});
		Iterator<Employee> it = emps.iterator();
		while(it.hasNext()){
			Employee emp = it.next();
			records.add(new String[]{emp.getId(),emp.getName(),emp.getRole(),emp.getSalary()});
		}
		return records;
	}
	private static List<Employee> parseCSVFileLineByLine() throws IOException {
		//create CSVReader object
		CSVReader reader = new CSVReader(new FileReader("employees.csv"), ',');
		List<Employee> emps = new ArrayList<Employee>();
		//read line by line
		String[] record = null;
		//skip header row
		reader.readNext();
		while((record = reader.readNext()) != null){
			Employee emp = new Employee();
			emp.setId(record[0]);
			emp.setName(record[1]);
			emp.setRole(record[2]);
			emp.setSalary(record[3]);
			emps.add(emp);
		}
		reader.close();
		System.out.println(emps);
		return emps;
	}
	private static void parseCSVFileAsList() throws IOException {
		//create CSVReader object
		CSVReader reader = new CSVReader(new FileReader("employees.csv"), ',');
		List<Employee> emps = new ArrayList<Employee>();
		//read all lines at once
		List<String[]> records = reader.readAll();
		Iterator<String[]> iterator = records.iterator();
		//skip header row
		iterator.next();
		while(iterator.hasNext()){
			String[] record = iterator.next();
			Employee emp = new Employee();
			emp.setId(record[0]);
			emp.setName(record[1]);
			emp.setRole(record[2]);
			emp.setSalary(record[3]);
			emps.add(emp);
		}
		reader.close();
		System.out.println(emps);
	}
}

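When we run the above OpenCSV example program, we get the following output. The third block shows Salary=null because the Salary column mapping is commented out in parseCSVToBeanList().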


[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=5,000USD
, ID=2,Name=Lisa,Role=Manager,Salary=500USD
, ID=3,Name=David,Role=,Salary=1000USD
]
**********
[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=5,000USD
, ID=2,Name=Lisa,Role=Manager,Salary=500USD
, ID=3,Name=David,Role=,Salary=1000USD
]
**********
[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=null
, ID=2,Name=Lisa,Role=Manager,Salary=null
, ID=3,Name=David,Role=,Salary=null
]
**********
"ID"#"Name"#"Role"#"Salary"
"1"#"Pankaj Kumar"#"CEO"#"5,000USD"
"2"#"Lisa"#"Manager"#"500USD"
"3"#"David"#""#"1000USD"

As you can see, OpenCSV also lets us set the delimiter character when parsing or writing CSV data.
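What a writer has to do with a custom delimiter can be approximated in a few lines of plain Java. The sketch below is not OpenCSV's implementation: DelimiterSketch, escape and toLine are hypothetical names, and it quotes a field only when needed, whereas the CSVWriter output above quotes every field.

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of OpenCSV.
class DelimiterSketch {

    // Quotes a field if it contains the delimiter, a double quote, or a line break.
    static String escape(String field, char delimiter) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.indexOf(delimiter) >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\n') >= 0
                || field.indexOf('\r') >= 0;
        if (!needsQuotes) {
            return field;
        }
        // Embedded quotes are doubled, as described in RFC 4180.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Joins the escaped fields of one record with the given delimiter.
    static String toLine(String[] fields, char delimiter) {
        StringJoiner joiner = new StringJoiner(String.valueOf(delimiter));
        for (String field : fields) {
            joiner.add(escape(field, delimiter));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        String[] row = {"1", "Pankaj Kumar", "CEO", "5,000USD"};
        System.out.println(toLine(row, '#')); // prints 1#Pankaj Kumar#CEO#5,000USD
        System.out.println(toLine(row, ',')); // prints 1,Pankaj Kumar,CEO,"5,000USD"
    }
}
```

With '#' as the delimiter, the comma in the salary is ordinary data and needs no quotes. With ',' as the delimiter, the same field must be quoted to stay a single column.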

2. Apache Commons CSV

You can download the Apache Commons CSV binaries or include the dependency using Maven as shown below.


<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.3</version>
</dependency>

Apache Commons CSV is simple to use: the CSVParser class is used to parse CSV data and the CSVPrinter class is used to write it.

Example code to parse the above CSV file into a list of Employee objects is given below.


package com.journaldev.parser.csv;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVPrinter;
import org.apache.commons.csv.CSVRecord;
public class ApacheCommonsCSVParserExample {
	public static void main(String[] args) throws FileNotFoundException, IOException {
		//Create the CSVFormat object
		CSVFormat format = CSVFormat.RFC4180.withHeader().withDelimiter(',');
		//initialize the CSVParser object
		CSVParser parser = new CSVParser(new FileReader("employees.csv"), format);
		List<Employee> emps = new ArrayList<Employee>();
		for(CSVRecord record : parser){
			Employee emp = new Employee();
			emp.setId(record.get("ID"));
			emp.setName(record.get("Name"));
			emp.setRole(record.get("Role"));
			emp.setSalary(record.get("Salary"));
			emps.add(emp);
		}
		//close the parser
		parser.close();
		System.out.println(emps);
		//CSV Write Example using CSVPrinter
		CSVPrinter printer = new CSVPrinter(System.out, format.withDelimiter('#'));
		System.out.println("********");
		printer.printRecord("ID","Name","Role","Salary");
		for(Employee emp : emps){
			List<String> empData = new ArrayList<String>();
			empData.add(emp.getId());
			empData.add(emp.getName());
			empData.add(emp.getRole());
			empData.add(emp.getSalary());
			printer.printRecord(empData);
		}
		//close the printer
		printer.close();
	}
}

When we run the above program, we get the following output.


[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=5,000USD
, ID=2,Name=Lisa,Role=Manager,Salary=500USD
, ID=3,Name=David,Role=,Salary=1000USD
]
********
ID#Name#Role#Salary
1#Pankaj Kumar#CEO#5,000USD
2#Lisa#Manager#500USD
3#David##1000USD

3. Super CSV

While searching for good CSV parsers, I saw many developers recommending Super CSV on Stack Overflow, so I decided to give it a try. Download the Super CSV library from the SourceForge website and include the jar file in the project build path.

If you are using Maven, just add the dependency below.


<dependency>
    <groupId>net.sf.supercsv</groupId>
    <artifactId>super-csv</artifactId>
    <version>2.4.0</version>
</dependency>

To parse a CSV file into a list of objects, we need to create an instance of CsvBeanReader. We can set cell-specific rules using a CellProcessor array. We can use it to read directly from a CSV file into a Java bean and vice versa.

Writing CSV data is similar; we use the CsvBeanWriter class.


package com.journaldev.parser.csv;
import java.io.FileReader;
import java.io.IOException;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import org.supercsv.cellprocessor.Optional;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.constraint.UniqueHashCode;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanReader;
import org.supercsv.io.CsvBeanWriter;
import org.supercsv.io.ICsvBeanReader;
import org.supercsv.io.ICsvBeanWriter;
import org.supercsv.prefs.CsvPreference;
public class SuperCSVParserExample {
	public static void main(String[] args) throws IOException {
		List<Employee> emps = readCSVToBean();
		System.out.println(emps);
		System.out.println("******");
		writeCSVData(emps);
	}
	private static void writeCSVData(List<Employee> emps) throws IOException {
		ICsvBeanWriter beanWriter = null;
		StringWriter writer = new StringWriter();
		try{
			beanWriter = new CsvBeanWriter(writer, CsvPreference.STANDARD_PREFERENCE);
			final String[] header = new String[]{"id","name","role","salary"};
			final CellProcessor[] processors = getProcessors();
			// write the header
			beanWriter.writeHeader(header);
			// write the bean's data
			for(Employee emp: emps){
				beanWriter.write(emp, header, processors);
			}
		}finally{
			if( beanWriter != null ) {
				beanWriter.close();
			}
		}
		System.out.println("CSV Data\n"+writer.toString());
	}
	private static List<Employee> readCSVToBean() throws IOException {
		ICsvBeanReader beanReader = null;
		List<Employee> emps = new ArrayList<Employee>();
		try {
			beanReader = new CsvBeanReader(new FileReader("employees.csv"),
					CsvPreference.STANDARD_PREFERENCE);
			// the name mapping provide the basis for bean setters
			final String[] nameMapping = new String[]{"id","name","role","salary"};
			//just read the header, so that it doesn't get mapped to an Employee object
			final String[] header = beanReader.getHeader(true);
			final CellProcessor[] processors = getProcessors();
			Employee emp;
			while ((emp = beanReader.read(Employee.class, nameMapping,
					processors)) != null) {
				emps.add(emp);
			}
		} finally {
			if (beanReader != null) {
				beanReader.close();
			}
		}
		return emps;
	}
	private static CellProcessor[] getProcessors() {
		final CellProcessor[] processors = new CellProcessor[] {
				new UniqueHashCode(), // ID (must be unique)
				new NotNull(), // Name
				new Optional(), // Role
				new NotNull() // Salary
		};
		return processors;
	}
}

When we run the above Super CSV example program, we get the output below.


[ID=1,Name=Pankaj Kumar,Role=CEO,Salary=5,000USD
, ID=2,Name=Lisa,Role=Manager,Salary=500USD
, ID=3,Name=David,Role=null,Salary=1000USD
]
******
CSV Data
id,name,role,salary
1,Pankaj Kumar,CEO,"5,000USD"
2,Lisa,Manager,500USD
3,David,,1000USD

As you can see, the Role field is set as Optional because it’s empty in the third row. If we change that to NotNull, we get the following exception.


Exception in thread "main" org.supercsv.exception.SuperCsvConstraintViolationException: null value encountered
processor=org.supercsv.cellprocessor.constraint.NotNull
context={lineNo=4, rowNo=4, columnNo=3, rowSource=[3, David, null, 1000USD]}
	at org.supercsv.cellprocessor.constraint.NotNull.execute(NotNull.java:71)
	at org.supercsv.util.Util.executeCellProcessors(Util.java:93)
	at org.supercsv.io.AbstractCsvReader.executeProcessors(AbstractCsvReader.java:203)
	at org.supercsv.io.CsvBeanReader.read(CsvBeanReader.java:206)
	at com.journaldev.parser.csv.SuperCSVParserExample.readCSVToBean(SuperCSVParserExample.java:66)
	at com.journaldev.parser.csv.SuperCSVParserExample.main(SuperCSVParserExample.java:23)

So Super CSV gives us the option to apply conditional logic to individual fields, which we did not see in the other CSV parser examples. It’s easy to use and the learning curve is small.
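The idea of per-column rules can be illustrated without any library. The following sketch does not use the Super CSV API: ColumnRuleSketch and its required, optional and unique methods are hypothetical stand-ins that only loosely mirror the NotNull, Optional and UniqueHashCode processors used above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical sketch of per-column rules; this does NOT use the Super CSV API.
class ColumnRuleSketch {

    // Rejects null or empty values.
    static UnaryOperator<String> required(String column) {
        return value -> {
            if (value == null || value.isEmpty()) {
                throw new IllegalArgumentException("null value encountered in column " + column);
            }
            return value;
        };
    }

    // Accepts anything and turns empty values into null.
    static UnaryOperator<String> optional() {
        return value -> (value == null || value.isEmpty()) ? null : value;
    }

    // Rejects a value that was already seen in this column.
    static UnaryOperator<String> unique(String column) {
        Set<String> seen = new HashSet<>();
        return value -> {
            if (!seen.add(value)) {
                throw new IllegalArgumentException("duplicate value in column " + column);
            }
            return value;
        };
    }

    // Applies rule i to field i and returns the processed row.
    static String[] apply(String[] row, List<UnaryOperator<String>> rules) {
        String[] out = new String[row.length];
        for (int i = 0; i < row.length; i++) {
            out[i] = rules.get(i).apply(row[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> rules = Arrays.asList(
                unique("ID"), required("Name"), optional(), required("Salary"));
        String[] row = {"3", "David", "", "1000USD"};
        System.out.println(Arrays.toString(apply(row, rules)));
        // prints: [3, David, null, 1000USD]
    }
}
```

Replacing optional() with required("Role") in the rules list would make the third row fail with an exception, which is the same kind of behavior shown in the stack trace above.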

That’s all for the Java CSV parser example tutorial. Whether to use OpenCSV, Apache Commons CSV or Super CSV depends on your requirements; all three are easy to use.

By admin
