reading file line by line in Java with BufferedReader

Reading files in Java is the crusade for a lot of confusion. There are multiple ways of accomplishing the same job and information technology's oft not clear which file reading method is all-time to employ. Something that'southward quick and muddy for a small example file might not exist the best method to utilize when y'all need to read a very big file. Something that worked in an earlier Coffee version, might not exist the preferred method anymore.

This article aims to be the definitive guide for reading files in Java 7, viii and 9. I'm going to cover all the ways yous tin read files in Java. Too often, you lot'll read an article that tells you i way to read a file, just to discover later there are other ways to do that. I'm actually going to cover 15 dissimilar ways to read a file in Coffee. I'm going to encompass reading files in multiple ways with the cadre Coffee libraries as well as two third party libraries.

Simply that's not all – what good is knowing how to practise something in multiple ways if you don't know which way is best for your state of affairs?

I besides put each of these methods to a real performance exam and certificate the results. That way, y'all will accept some hard data to know the performance metrics of each method.

Methodology

JDK Versions

Java lawmaking samples don't alive in isolation, especially when it comes to Coffee I/O, as the API keeps evolving. All code for this article has been tested on:

  • Java SE 7 (jdk1.7.0_80)
  • Java SE 8 (jdk1.8.0_162)
  • Java SE ix (jdk-9.0.four)

When there is an incompatibility, it will be stated in that section. Otherwise, the code works unaltered for different Java versions. The primary incompatibility is the use of lambda expressions which was introduced in Coffee 8.

Java File Reading Libraries

There are multiple ways of reading from files in Java. This article aims to be a comprehensive drove of all the different methods. I will embrace:

  • coffee.io.FileReader.read()
  • java.io.BufferedReader.readLine()
  • java.io.FileInputStream.read()
  • coffee.io.BufferedInputStream.read()
  • java.nio.file.Files.readAllBytes()
  • coffee.nio.file.Files.readAllLines()
  • java.nio.file.Files.lines()
  • java.util.Scanner.nextLine()
  • org.apache.commons.io.FileUtils.readLines() – Apache Commons
  • com.google.common.io.Files.readLines() – Google Guava

Endmost File Resource

Prior to JDK7, when opening a file in Coffee, all file resource would need to be manually closed using a effort-grab-finally block. JDK7 introduced the try-with-resource statement, which simplifies the process of closing streams. You lot no longer demand to write explicit code to close streams considering the JVM will automatically close the stream for you, whether an exception occurred or non. All examples used in this commodity use the endeavor-with-resources statement for importing, loading, parsing and closing files.

File Location

All examples will read examination files from C:\temp.

Encoding

Character encoding is non explicitly saved with text files so Java makes assumptions about the encoding when reading files. Usually, the assumption is correct but sometimes you want to be explicit when instructing your programs to read from files. When encoding isn't correct, you'll see funny characters appear when reading files.

All examples for reading text files utilize ii encoding variations:
Default organization encoding where no encoding is specified and explicitly setting the encoding to UTF-8.

Download Code

All code files are bachelor from Github.

Lawmaking Quality and Code Encapsulation

In that location is a difference between writing lawmaking for your personal or work projection and writing code to explicate and teach concepts.

If I was writing this code for my own project, I would utilise proper object-oriented principles like encapsulation, brainchild, polymorphism, etc. But I wanted to make each instance stand alone and hands understood, which meant that some of the lawmaking has been copied from one example to the next. I did this on purpose because I didn't want the reader to have to figure out all the encapsulation and object structures I so cleverly created. That would take away from the examples.

For the same reason, I chose Non to write these example with a unit testing framework like JUnit or TestNG because that's not the purpose of this commodity. That would add another library for the reader to understand that has nothing to exercise with reading files in Java. That's why all the instance are written inline inside the chief method, without extra methods or classes.

My main purpose is to make the examples every bit easy to understand every bit possible and I believe that having extra unit testing and encapsulation code will not assist with this. That doesn't hateful that'southward how I would encourage you to write your own personal code. It'due south just the way I chose to write the examples in this commodity to make them easier to understand.

Exception Handling

All examples declare whatever checked exceptions in the throwing method declaration.

The purpose of this article is to bear witness all the different ways to read from files in Java – it's not meant to show how to handle exceptions, which will be very specific to your situation.

So instead of creating unhelpful attempt catch blocks that merely print exception stack traces and clutter up the code, all example will declare any checked exception in the calling method. This volition make the code cleaner and easier to understand without sacrificing whatsoever functionality.

Hereafter Updates

As Java file reading evolves, I will be updating this commodity with any required changes.

File Reading Methods

I organized the file reading methods into three groups:

  • Archetype I/O classes that have been part of Coffee since before JDK 1.vii. This includes the coffee.io and java.util packages.
  • New Java I/O classes that have been part of Java since JDK1.7. This covers the coffee.nio.file.Files class.
  • Third party I/O classes from the Apache Commons and Google Guava projects.

Classic I/O – Reading Text

1a) FileReader – Default Encoding

FileReader reads in ane character at a time, without whatever buffering. Information technology's meant for reading text files. Information technology uses the default character encoding on your system, so I have provided examples for both the default example, as well every bit specifying the encoding explicitly.

          

1
2
three
4
v
vi
vii
8
9
10
11
12
13
14
xv
16
17
18
19

import java.io.FileReader ;
import coffee.io.IOException ;

public grade ReadFile_FileReader_Read {
public static void main( String [ ] pArgs) throws IOException {
Cord fileName = "c:\\temp\\sample-10KB.txt" ;

try ( FileReader fileReader = new FileReader (fileName) ) {
int singleCharInt;
char singleChar;
while ( (singleCharInt = fileReader.read ( ) ) != - one ) {
singleChar = ( char ) singleCharInt;

//brandish one graphic symbol at a time
Organisation.out.print (singleChar) ;
}
}
}
}

1b) FileReader – Explicit Encoding (InputStreamReader)

It's actually not possible to set up the encoding explicitly on a FileReader so you have to use the parent grade, InputStreamReader and wrap it around a FileInputStream:

          

1
two
three
4
v
6
7
8
ix
10
11
12
13
fourteen
fifteen
16
17
18
xix
20
21
22

import java.io.FileInputStream ;
import java.io.IOException ;
import java.io.InputStreamReader ;

public class ReadFile_FileReader_Read_Encoding {
public static void main( Cord [ ] pArgs) throws IOException {
Cord fileName = "c:\\temp\\sample-10KB.txt" ;
FileInputStream fileInputStream = new FileInputStream (fileName) ;

//specify UTF-8 encoding explicitly
try ( InputStreamReader inputStreamReader =
new InputStreamReader (fileInputStream, "UTF-viii" ) ) {

int singleCharInt;
char singleChar;
while ( (singleCharInt = inputStreamReader.read ( ) ) != - ane ) {
singleChar = ( char ) singleCharInt;
System.out.print (singleChar) ; //brandish one character at a time
}
}
}
}

2a) BufferedReader – Default Encoding

BufferedReader reads an unabridged line at a time, instead of 1 graphic symbol at a time like FileReader. It's meant for reading text files.

          

1
2
3
4
5
6
vii
8
9
10
11
12
13
14
15
sixteen
17

import java.io.BufferedReader ;
import java.io.FileReader ;
import java.io.IOException ;

public class ReadFile_BufferedReader_ReadLine {
public static void master( Cord [ ] args) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
FileReader fileReader = new FileReader (fileName) ;

endeavor ( BufferedReader bufferedReader = new BufferedReader (fileReader) ) {
String line;
while ( (line = bufferedReader.readLine ( ) ) != zero ) {
Organization.out.println (line) ;
}
}
}
}

2b) BufferedReader – Explicit Encoding

In a like manner to how we ready encoding explicitly for FileReader, we need to create FileInputStream, wrap it inside InputStreamReader with an explicit encoding and laissez passer that to BufferedReader:

          

1
2
iii
4
five
vi
7
eight
9
x
11
12
13
14
15
16
17
xviii
xix
20
21
22

import coffee.io.BufferedReader ;
import java.io.FileInputStream ;
import java.io.IOException ;
import java.io.InputStreamReader ;

public class ReadFile_BufferedReader_ReadLine_Encoding {
public static void main( Cord [ ] args) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;

FileInputStream fileInputStream = new FileInputStream (fileName) ;

//specify UTF-8 encoding explicitly
InputStreamReader inputStreamReader = new InputStreamReader (fileInputStream, "UTF-8" ) ;

try ( BufferedReader bufferedReader = new BufferedReader (inputStreamReader) ) {
String line;
while ( (line = bufferedReader.readLine ( ) ) != zilch ) {
System.out.println (line) ;
}
}
}
}

Archetype I/O – Reading Bytes

one) FileInputStream

FileInputStream reads in one byte at a time, without any buffering. While it's meant for reading binary files such equally images or audio files, information technology can still be used to read text file. Information technology's like to reading with FileReader in that you're reading 1 character at a time as an integer and you need to cast that int to a char to see the ASCII value.

By default, it uses the default character encoding on your organisation, then I have provided examples for both the default instance, as well every bit specifying the encoding explicitly.

          

1
2
3
4
five
6
7
8
9
ten
11
12
13
14
15
16
17
18
xix
20
21

import coffee.io.File ;
import java.io.FileInputStream ;
import java.io.FileNotFoundException ;
import java.io.IOException ;

public class ReadFile_FileInputStream_Read {
public static void main( Cord [ ] pArgs) throws FileNotFoundException, IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

try ( FileInputStream fileInputStream = new FileInputStream (file) ) {
int singleCharInt;
char singleChar;

while ( (singleCharInt = fileInputStream.read ( ) ) != - 1 ) {
singleChar = ( char ) singleCharInt;
System.out.print (singleChar) ;
}
}
}
}

2) BufferedInputStream

BufferedInputStream reads a prepare of bytes all at once into an internal byte array buffer. The buffer size can exist set explicitly or use the default, which is what we'll demonstrate in our example. The default buffer size appears to exist 8KB but I have non explicitly verified this. All performance tests used the default buffer size so it volition automatically re-size the buffer when information technology needs to.

          

ane
2
3
four
v
six
7
eight
9
10
11
12
13
14
fifteen
16
17
18
nineteen
20
21
22

import java.io.BufferedInputStream ;
import java.io.File ;
import java.io.FileInputStream ;
import java.io.FileNotFoundException ;
import coffee.io.IOException ;

public class ReadFile_BufferedInputStream_Read {
public static void main( Cord [ ] pArgs) throws FileNotFoundException, IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;
FileInputStream fileInputStream = new FileInputStream (file) ;

try ( BufferedInputStream bufferedInputStream = new BufferedInputStream (fileInputStream) ) {
int singleCharInt;
char singleChar;
while ( (singleCharInt = bufferedInputStream.read ( ) ) != - 1 ) {
singleChar = ( char ) singleCharInt;
System.out.print (singleChar) ;
}
}
}
}

New I/O – Reading Text

1a) Files.readAllLines() – Default Encoding

The Files class is part of the new Coffee I/O classes introduced in jdk1.7. It only has static utility methods for working with files and directories.

The readAllLines() method that uses the default character encoding was introduced in jdk1.eight then this example will non work in Java 7.

          

1
2
3
4
5
6
7
8
nine
ten
eleven
12
13
14
15
16
17

import java.io.File ;
import java.io.IOException ;
import coffee.nio.file.Files ;
import coffee.util.List ;

public course ReadFile_Files_ReadAllLines {
public static void main( String [ ] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

List fileLinesList = Files.readAllLines (file.toPath ( ) ) ;

for ( String line : fileLinesList) {
System.out.println (line) ;
}
}
}

1b) Files.readAllLines() – Explicit Encoding

          

ane
2
3
four
5
vi
vii
viii
9
x
11
12
thirteen
14
fifteen
16
17
18
19

import java.io.File ;
import java.io.IOException ;
import coffee.nio.charset.StandardCharsets ;
import java.nio.file.Files ;
import java.util.List ;

public form ReadFile_Files_ReadAllLines_Encoding {
public static void main( Cord [ ] pArgs) throws IOException {
Cord fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

//employ UTF-8 encoding
List fileLinesList = Files.readAllLines (file.toPath ( ), StandardCharsets.UTF_8 ) ;

for ( String line : fileLinesList) {
System.out.println (line) ;
}
}
}

2a) Files.lines() – Default Encoding

This code was tested to piece of work in Coffee 8 and 9. Java vii didn't run because of the lack of support for lambda expressions.

          

1
2
3
4
five
6
7
viii
9
10
11
12
13
14
15
16
17

import coffee.io.File ;
import java.io.IOException ;
import coffee.nio.file.Files ;
import java.util.stream.Stream ;

public class ReadFile_Files_Lines {
public static void main( String [ ] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

try (Stream linesStream = Files.lines (file.toPath ( ) ) ) {
linesStream.forEach (line -> {
System.out.println (line) ;
} ) ;
}
}
}

2b) Files.lines() – Explicit Encoding

Just similar in the previous example, this code was tested and works in Java 8 and 9 but not in Coffee 7.

          

1
2
3
4
5
six
7
8
nine
10
11
12
13
14
15
sixteen
17
xviii

import java.io.File ;
import java.io.IOException ;
import java.nio.charset.StandardCharsets ;
import coffee.nio.file.Files ;
import coffee.util.stream.Stream ;

public form ReadFile_Files_Lines_Encoding {
public static void principal( Cord [ ] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

try (Stream linesStream = Files.lines (file.toPath ( ), StandardCharsets.UTF_8 ) ) {
linesStream.forEach (line -> {
Arrangement.out.println (line) ;
} ) ;
}
}
}

3a) Scanner – Default Encoding

The Scanner class was introduced in jdk1.7 and tin exist used to read from files or from the panel (user input).

          

1
2
iii
four
5
half dozen
7
viii
nine
10
xi
12
13
14
15
sixteen
17
18
19

import java.io.File ;
import java.io.FileNotFoundException ;
import java.util.Scanner ;

public course ReadFile_Scanner_NextLine {
public static void main( String [ ] pArgs) throws FileNotFoundException {
Cord fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

try (Scanner scanner = new Scanner(file) ) {
String line;
boolean hasNextLine = faux ;
while (hasNextLine = scanner.hasNextLine ( ) ) {
line = scanner.nextLine ( ) ;
System.out.println (line) ;
}
}
}
}

3b) Scanner – Explicit Encoding

          

one
two
3
4
5
6
vii
8
9
10
11
12
13
14
fifteen
16
17
eighteen
xix
20

import java.io.File ;
import coffee.io.FileNotFoundException ;
import coffee.util.Scanner ;

public class ReadFile_Scanner_NextLine_Encoding {
public static void primary( String [ ] pArgs) throws FileNotFoundException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

//use UTF-eight encoding
try (Scanner scanner = new Scanner(file, "UTF-8" ) ) {
String line;
boolean hasNextLine = faux ;
while (hasNextLine = scanner.hasNextLine ( ) ) {
line = scanner.nextLine ( ) ;
System.out.println (line) ;
}
}
}
}

New I/O – Reading Bytes

Files.readAllBytes()

Even though the documentation for this method states that "it is not intended for reading in big files" I establish this to be the absolute best performing file reading method, fifty-fifty on files every bit large as 1GB.

          

1
2
iii
4
5
6
7
eight
9
10
eleven
12
thirteen
14
15
xvi
17

import java.io.File ;
import coffee.io.IOException ;
import java.nio.file.Files ;

public class ReadFile_Files_ReadAllBytes {
public static void principal( String [ ] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

byte [ ] fileBytes = Files.readAllBytes (file.toPath ( ) ) ;
char singleChar;
for ( byte b : fileBytes) {
singleChar = ( char ) b;
Organisation.out.print (singleChar) ;
}
}
}

third Party I/O – Reading Text

Commons – FileUtils.readLines()

Apache Commons IO is an open source Coffee library that comes with utility classes for reading and writing text and binary files. I listed it in this commodity considering it can exist used instead of the built in Coffee libraries. The class we're using is FileUtils.

For this article, version 2.6 was used which is compatible with JDK 1.seven+

Notation that you demand to explicitly specify the encoding and that method for using the default encoding has been deprecated.

          

one
2
3
four
v
6
vii
viii
ix
10
xi
12
thirteen
14
fifteen
16
17
18

import java.io.File ;
import java.io.IOException ;
import java.util.List ;

import org.apache.commons.io.FileUtils ;

public form ReadFile_Commons_FileUtils_ReadLines {
public static void main( String [ ] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

List fileLinesList = FileUtils.readLines (file, "UTF-8" ) ;

for ( String line : fileLinesList) {
System.out.println (line) ;
}
}
}

Guava – Files.readLines()

Google Guava is an open source library that comes with utility classes for common tasks like collections handling, enshroud direction, IO operations, string processing.

I listed information technology in this commodity because information technology can be used instead of the congenital in Coffee libraries and I wanted to compare its performance with the Coffee built in libraries.

For this commodity, version 23.0 was used.

I'one thousand not going to examine all the unlike ways to read files with Guava, since this article is not meant for that. For a more detailed await at all the different means to read and write files with Guava, have a await at Baeldung'southward in depth article.

When reading a file, Guava requires that the character encoding be fix explicitly, just similar Apache Commons.

Compatibility note: This code was tested successfully on Java 8 and ix. I couldn't get it to piece of work on Coffee 7 and kept getting "Unsupported major.pocket-size version 52.0" error. Guava has a separate API doc for Java seven which uses a slightly unlike version of the Files.readLine() method. I thought I could get information technology to work but I kept getting that error.

          

1
2
3
4
5
six
7
8
9
10
11
12
13
fourteen
xv
16
17
18
xix

import java.io.File ;
import java.io.IOException ;
import java.util.List ;

import com.google.common.base.Charsets ;
import com.google.common.io.Files ;

public class ReadFile_Guava_Files_ReadLines {
public static void main( String [ ] args) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt" ;
File file = new File (fileName) ;

List fileLinesList = Files.readLines (file, Charsets.UTF_8 ) ;

for ( String line : fileLinesList) {
System.out.println (line) ;
}
}
}

Functioning Testing

Since there are then many ways to read from a file in Java, a natural question is "What file reading method is the best for my situation?" So I decided to exam each of these methods against each other using sample data files of different sizes and timing the results.

Each code sample from this commodity displays the contents of the file to a cord and and then to the console (Organisation.out). All the same, during the functioning tests the Arrangement.out line was commented out since it would seriously slow down the performance of each method.

Each performance test measures the fourth dimension information technology takes to read in the file – line past line, character by character, or byte by byte without displaying anything to the console. I ran each test 5-ten times and took the average and then as non to let any outliers influence each test. I likewise ran the default encoding version of each file reading method – i.e. I didn't specify the encoding explicitly.

Dev Setup

The dev environs used for these tests:

  • Intel Core i7-3615 QM @2.3 GHz, 8GB RAM
  • Windows eight x64
  • Eclipse IDE for Coffee Developers, Oxygen.ii Release (4.vii.2)
  • Java SE ix (jdk-9.0.4)

Data Files

GitHub doesn't allow pushing files larger than 100 MB, so I couldn't find a practical way to store my big exam files to allow others to replicate my tests. Then instead of storing them, I'one thousand providing the tools I used to generate them so you can create test files that are similar in size to mine. Obviously they won't be the same, but yous'll generate files that are similar in size as I used in my performance tests.

Random Cord Generator was used to generate sample text and so I just copy-pasted to create larger versions of the file. When the file started getting too large to manage inside a text editor, I had to utilise the command line to merge multiple text files into a larger text file:

copy *.txt sample-1GB.txt

I created the following 7 data file sizes to exam each file reading method beyond a range of file sizes:

  • 1KB
  • 10KB
  • 100KB
  • 1MB
  • 10MB
  • 100MB
  • 1GB

Performance Summary

In that location were some surprises and some expected results from the performance tests.

As expected, the worst performers were the methods that read in a file character past grapheme or byte by byte. Merely what surprised me was that the native Coffee IO libraries outperformed both tertiary party libraries – Apache Eatables IO and Google Guava.

What'due south more – both Google Guava and Apache Commons IO threw a java.lang.OutOfMemoryError when trying to read in the one GB test file. This likewise happened with the Files.readAllLines(Path) method but the remaining 7 methods were able to read in all exam files, including the 1GB examination file.

The following table summarizes the boilerplate time (in milliseconds) each file reading method took to complete. I highlighted the elevation three methods in green, the average performing methods in yellowish and the worst performing methods in red:

The following chart summarizes the higher up table simply with the following changes:

I removed coffee.io.FileInputStream.read() from the chart because its performance was so bad it would skew the entire chart and you wouldn't run into the other lines properly
I summarized the data from 1KB to 1MB considering later on that, the chart would go likewise skewed with so many under performers and as well some methods threw a java.lang.OutOfMemoryError at 1GB

The Winners

The new Java I/O libraries (java.nio) had the best overall winner (java.nio.Files.readAllBytes()) but it was followed closely behind by BufferedReader.readLine() which was as well a proven pinnacle performer across the board. The other excellent performer was java.nio.Files.lines(Path) which had slightly worse numbers for smaller test files simply really excelled with the larger examination files.

The absolute fastest file reader across all data tests was coffee.nio.Files.readAllBytes(Path). Information technology was consistently the fastest and fifty-fifty reading a 1GB file only took most 1 second.

The following chart compares performance for a 100KB test file:

You can meet that the lowest times were for Files.readAllBytes(), BufferedInputStream.read() and BufferedReader.readLine().

The following nautical chart compares functioning for reading a 10MB file. I didn't bother including the bar for FileInputStream.Read() considering the performance was so bad it would skew the entire nautical chart and yous couldn't tell how the other methods performed relative to each other:

Files.readAllBytes() actually outperforms all other methods and BufferedReader.readLine() is a distant 2nd.

The Losers

Every bit expected, the absolute worst performer was java.io.FileInputStream.read() which was orders of magnitude slower than its rivals for nearly tests. FileReader.read() was also a poor performer for the same reason – reading files byte by byte (or character by character) instead of with buffers drastically degrades performance.

Both the Apache Commons IO FileUtils.readLines() and Guava Files.readLines() crashed with an OutOfMemoryError when trying to read the 1GB test file and they were about average in performance for the remaining test files.

java.nio.Files.readAllLines() too crashed when trying to read the 1GB exam file simply it performed quite well for smaller file sizes.

Performance Rankings

Hither's a ranked list of how well each file reading method did, in terms of speed and treatment of large files, besides as compatibility with different Coffee versions.

Rank File Reading Method
1 java.nio.file.Files.readAllBytes()
2 java.io.BufferedFileReader.readLine()
3 java.nio.file.Files.lines()
4 java.io.BufferedInputStream.read()
5 java.util.Scanner.nextLine()
6 java.nio.file.Files.readAllLines()
7 org.apache.commons.io.FileUtils.readLines()
8 com.google.common.io.Files.readLines()
9 java.io.FileReader.read()
ten java.io.FileInputStream.Read()

Conclusion

I tried to present a comprehensive set of methods for reading files in Java, both text and binary. We looked at fifteen different ways of reading files in Java and nosotros ran performance tests to come across which methods are the fastest.

The new Coffee IO library (java.nio) proved to exist a great performer but so was the archetype BufferedReader.