Java IO Introduction

Introduction

Data transfer takes up a significant amount of CPU time as well as developer time. If your application reads or writes data, IO is one of the first places to look for performance improvements. Java IO has been around for a long time and has classes that handle most common tasks for you. Java NIO adds capabilities aimed at enterprise-level applications and provides ways to handle large files. It works closer to the operating system than Java IO, which can improve performance. However, NIO should not be considered an advanced version of Java IO. Instead, it complements Java IO by providing methods that help build more scalable applications.

Java IO (java.io)

Java IO has been around for a long time and has been used by almost all Java developers. It does four basic things: read characters, write characters, read bytes and write bytes. The four corresponding base classes are Reader, Writer, InputStream and OutputStream. There are convenience classes to read from and write to a file. Java IO makes extensive use of the decorator pattern: the classes that buffer an input or output stream are examples, since each wraps another stream and adds behaviour to it. There are also classes to read and write primitive types and objects.
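The decorator pattern described above can be sketched as follows. This is a minimal illustration rather than a recommended design: the class name DecoratorDemo and the methods readLines and roundTrip are hypothetical names chosen for this example, and an in-memory byte array stands in for a file so that the code runs anywhere.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
public class DecoratorDemo {

    // Decorates a raw byte stream twice: InputStreamReader decodes bytes
    // into characters, and BufferedReader adds buffering and readLine().
    public static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Writes two primitives through DataOutputStream, reads them back
    // through DataInputStream, and returns their product.
    public static double roundTrip(int count, double price) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(count);
            out.writeDouble(price);
        }
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            int readCount = in.readInt();       // values come back in write order
            double readPrice = in.readDouble();
            return readCount * readPrice;
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream(
                "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLines(source));
        System.out.println(roundTrip(3, 2.5));
    }
}
```

The same readLines method would work unchanged with a FileInputStream or a socket's input stream, which is the main benefit of layering decorators over a common base class.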

Java NIO (java.nio)

NIO, or New IO, has a series of advanced classes that help in building scalable enterprise applications. It delegates some operations to the operating system and so benefits from the optimizations the operating system and file system provide. These are the capabilities of NIO:

  • Buffers – Buffers are fixed-capacity containers for primitive values (bytes, chars, ints and so on) and underpin most NIO classes. Data is read from a channel into a buffer and written from a buffer to a channel, and the buffer classes provide methods for working with that data. Converting data from, say, a byte buffer to a char buffer is straightforward using the buffer and charset classes.
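  • Direct Buffers – A direct buffer is allocated in native memory outside the JVM heap rather than on the heap. The operating system can perform IO on this memory without first copying the data into an intermediate buffer, which can improve IO performance, particularly for large, long-lived buffers. It is also possible to wrap a buffer around an area of memory allotted outside the JVM and manage that memory using NIO methods.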
  • Memory Mapped Files – Loading a large file into the JVM may take a long time and may fail because the heap fills up. NIO can map a file, or a region of it, without first reading it onto the heap. A buffer is created over the file, the operating system pages the contents in on demand, and the file can be read or written directly through the mapped buffer. This enables Java to handle very large files.
  • Non Blocking IO and Multiplexed Channels – Traditional IO on sockets is blocking. When the socket server receives a new client connection, it typically needs a dedicated thread to handle that client. Imagine an application that handles thousands of clients: creating that many threads may overload the server. Java NIO provides the selector framework for this case. Channels (the interfaces that transfer data between a buffer and a file or socket) are registered with a selector and selected when they are ready for IO, so instead of creating a new thread per socket client, the client channels are handled only when data is available for reading or writing. In most cases the selection is done by the operating system, which notifies the selector when a channel is ready. A thread pool may be used to handle the selected channels. Threads from this pool do not sit blocked waiting for data, because they are used only when data is available and are released immediately afterwards.
  • Scatter read and Gather write – Sometimes the files to be read have a fixed structure. It is possible to create multiple buffers and read the file directly into them instead of reading it sequentially byte by byte. For example, if there are three buffers, bufferA (100 bytes), bufferB (200 bytes) and bufferC (300 bytes), then 600 bytes of a file can be read with a single call and the bytes fill the buffers in order. A gathering write does the reverse and writes the contents of several buffers to a channel in one call.
  • Direct Data Transfer – It is possible to transfer data directly from one channel to another using the transferTo and transferFrom methods of FileChannel. This provides an efficient way to copy files.
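Three of the capabilities listed above (scatter reads, memory mapped files and direct channel-to-channel transfer) can be sketched in one small class. Treat this as a hedged illustration under stated assumptions, not a reference implementation: the class name NioDemo, its method names, and the 4-byte header layout of the sample file are all invented for this example, and the sample text is plain ASCII so that splitting it across buffers cannot cut a multi-byte character in half.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; names and file layout are illustrative only.
public class NioDemo {

    // Scatter read: the channel fills the header buffer first, then the body.
    public static String[] scatterRead(Path file, int headerSize, int bodySize)
            throws IOException {
        ByteBuffer header = ByteBuffer.allocate(headerSize);
        ByteBuffer body = ByteBuffer.allocate(bodySize);
        ByteBuffer[] buffers = {header, body};
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // A single call may not fill every buffer, so loop until
            // both are full or the end of the file is reached.
            while ((header.hasRemaining() || body.hasRemaining())
                    && channel.read(buffers) != -1) {
                // keep reading
            }
        }
        header.flip(); // switch each buffer from writing to reading
        body.flip();
        // Charset.decode converts a ByteBuffer into a CharBuffer.
        return new String[] {
            StandardCharsets.UTF_8.decode(header).toString(),
            StandardCharsets.UTF_8.decode(body).toString()
        };
    }

    // Memory mapped file: counts newline bytes without copying the
    // file contents onto the heap.
    public static long countLines(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            long lines = 0;
            while (mapped.hasRemaining()) {
                if (mapped.get() == '\n') {
                    lines++;
                }
            }
            return lines;
        }
    }

    // Direct transfer: copies source to target channel-to-channel.
    public static long copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long size = in.size();
            // transferTo may move fewer bytes than requested, so loop.
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("nio-demo", ".txt");
        Files.write(source,
                "HEADbody line 1\nbody line 2\n".getBytes(StandardCharsets.UTF_8));
        String[] parts = scatterRead(source, 4, 11);
        System.out.println(parts[0] + " | " + parts[1]);
        System.out.println("lines: " + countLines(source));
        Path target = Files.createTempFile("nio-copy", ".txt");
        System.out.println("bytes copied: " + copy(source, target));
    }
}
```

Non-blocking sockets with a Selector are left out of this sketch because a meaningful example needs both a server and a client.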

Apache Commons IO

Apache Commons IO has convenience classes for many IO tasks. For example, it has methods that take a file and return a String containing all of the file's content. It also has classes to traverse a directory using a file filter. These classes can significantly reduce development time and make code easier to read and write.

If you find anything missing in the tutorials then please drop in your comments.