Class ArffLoader.ArffReader

java.lang.Object
weka.core.converters.ArffLoader.ArffReader
All Implemented Interfaces:
RevisionHandler
Enclosing class:
ArffLoader

public static class ArffLoader.ArffReader extends Object implements RevisionHandler
Reads data from an ARFF file in either incremental or batch mode.

Typical code for batch usage:

 BufferedReader reader =
   new BufferedReader(new FileReader("/some/where/file.arff"));
 ArffReader arff = new ArffReader(reader);
 Instances data = arff.getData();
 data.setClassIndex(data.numAttributes() - 1);
 
Typical code for incremental usage:
 BufferedReader reader =
   new BufferedReader(new FileReader("/some/where/file.arff"));
 ArffReader arff = new ArffReader(reader, 1000);
 Instances data = arff.getStructure();
 data.setClassIndex(data.numAttributes() - 1);
 Instance inst;
 while ((inst = arff.readInstance(data)) != null) {
   data.add(inst);
 }
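
When the source is too large to hold in memory, the incremental mode can also process each instance as it is read instead of adding it to the dataset. The following is a minimal sketch, assuming Weka is on the classpath and the last attribute is a nominal class with no missing values; the class name, the method name, and the inline ARFF text are made up for illustration:

```java
import java.io.IOException;
import java.io.StringReader;

import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ArffLoader.ArffReader;

// Hypothetical helper class, not part of Weka.
final class ArffStreamingSketch {

  /** Counts instances per class label without storing the instances. */
  static int[] countClassLabels(String arffText) throws IOException {
    // Capacity 0: only the header is read, no space is reserved for instances.
    ArffReader arff = new ArffReader(new StringReader(arffText), 0);
    Instances structure = arff.getStructure();
    structure.setClassIndex(structure.numAttributes() - 1);

    int[] counts = new int[structure.numClasses()];
    Instance inst;
    while ((inst = arff.readInstance(structure)) != null) {
      counts[(int) inst.classValue()]++;  // index of the nominal class label
    }
    return counts;
  }

  public static void main(String[] args) throws IOException {
    // Illustrative ARFF content; the trailing "" gives the last row a line end.
    String arff = String.join("\n",
        "@relation weather",
        "@attribute temperature numeric",
        "@attribute play {yes,no}",
        "@data",
        "85,no",
        "70,yes",
        "64,yes",
        "");
    int[] counts = countClassLabels(arff);
    System.out.println("yes=" + counts[0] + ", no=" + counts[1]);
  }
}
```

Because the instances are never added to the dataset, memory use stays roughly constant regardless of the number of data rows.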
 
Version:
$Revision: 11137 $
Author:
Eibe Frank (eibe@cs.waikato.ac.nz), Len Trigg (trigg@cs.waikato.ac.nz), fracpete (fracpete at waikato dot ac dot nz)
  • Constructor Details

    • ArffReader

      public ArffReader(Reader reader) throws IOException
      Reads the data completely from the reader. The data can be accessed via the getData() method.
      Parameters:
      reader - the reader to use
      Throws:
      IOException - if something goes wrong
    • ArffReader

      public ArffReader(Reader reader, int capacity) throws IOException
      Reads only the header and reserves the specified space for instances. Instances can then be read one at a time via readInstance().
      Parameters:
      reader - the reader to use
      capacity - the capacity of the new dataset
      Throws:
      IOException - if something goes wrong
      IllegalArgumentException - if capacity is negative
    • ArffReader

      public ArffReader(Reader reader, Instances template, int lines) throws IOException
      Reads data that has no header, using the specified template as the header. The data can be accessed via the getData() method.
      Parameters:
      reader - the reader to use
      template - the template header
      lines - the number of lines read so far
      Throws:
      IOException - if something goes wrong
    • ArffReader

      public ArffReader(Reader reader, Instances template, int lines, int capacity) throws IOException
      Initializes the reader without reading a header; the specified template is used as the header instead. The data must be read via the readInstance() method.
      Parameters:
      reader - the reader to use
      template - the template header
      lines - the number of lines read so far
      capacity - the capacity of the new dataset
      Throws:
      IOException - if something goes wrong
  • Method Details

    • getLineNo

      public int getLineNo()
      Returns the current line number.
      Returns:
      the current line number
    • readInstance

      public Instance readInstance(Instances structure) throws IOException
      Reads a single instance using the tokenizer and returns it.
      Parameters:
      structure - the dataset header information; it is updated if the data contains string or relational attributes
      Returns:
      the instance read, or null if the end of the file has been reached
      Throws:
      IOException - if the information is not read successfully
    • readInstance

      public Instance readInstance(Instances structure, boolean flag) throws IOException
      Reads a single instance using the tokenizer and returns it.
      Parameters:
      structure - the dataset header information; it is updated if the data contains string or relational attributes
      flag - whether the method should test for a carriage return after each instance
      Returns:
      the instance read, or null if the end of the file has been reached
      Throws:
      IOException - if the information is not read successfully
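
The page does not describe the tokenizer itself. To illustrate how one data row could be split into values, the sketch below assumes a java.io.StreamTokenizer configured for ARFF-like input: commas and whitespace separate values, '%' starts a comment, quotes delimit strings, and line ends are significant. It is a simplified illustration and not Weka's implementation; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of Weka.
final class ArffLineTokenizerSketch {

  /** Configures a tokenizer for ARFF-like data rows (an assumed setup). */
  static StreamTokenizer newTokenizer(String text) {
    StreamTokenizer st = new StreamTokenizer(new StringReader(text));
    st.resetSyntax();
    st.whitespaceChars(0, ' ');       // control characters and space separate tokens
    st.wordChars(' ' + 1, '\u00FF');  // everything else is part of a word
    st.whitespaceChars(',', ',');     // commas separate values
    st.commentChar('%');              // '%' starts a comment that runs to the line end
    st.quoteChar('"');
    st.quoteChar('\'');
    st.eolIsSignificant(true);        // report line ends so rows can be delimited
    return st;
  }

  /** Reads the next non-empty row, or returns null at end of input. */
  static List<String> readRow(StreamTokenizer st) throws IOException {
    List<String> values = new ArrayList<>();
    int type;
    while ((type = st.nextToken()) != StreamTokenizer.TT_EOF) {
      if (type == StreamTokenizer.TT_EOL) {
        if (!values.isEmpty()) {
          return values;              // a line end closes the current row
        }
        continue;                     // skip blank and comment-only lines
      }
      values.add(st.sval);            // word and quoted tokens both set sval
    }
    return values.isEmpty() ? null : values;
  }
}
```

In this sketch a row ends at the line end, which mirrors the role of the flag parameter: checking for a line terminator after each instance catches rows with too many values.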
    • getStructure

      public Instances getStructure()
      Returns the header format.
      Returns:
      the header format
    • getData

      public Instances getData()
      Returns the data that was read.
      Returns:
      the data
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Returns:
      the revision