Package weka.core

Class SparseInstance

All Implemented Interfaces:
Serializable, Copyable, Instance, RevisionHandler
Direct Known Subclasses:
BinarySparseInstance

public class SparseInstance extends AbstractInstance
Class for storing an instance as a sparse vector. A sparse instance requires storage only for those attribute values that are non-zero. Because the objective is to reduce storage requirements for datasets with large numbers of default values, this also applies to nominal attributes: the first nominal value (i.e., the one with index 0) does not require explicit storage, so reorder your nominal attribute values if necessary. Missing values are stored explicitly.
Version:
$Revision: 15069 $
Author:
Eibe Frank
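
The storage idea in the class description can be sketched without Weka on the classpath. The class below is a hypothetical illustration, not Weka's internal representation: the name `SparseSketch` and its fields are assumptions, and missing values are modelled as `Double.NaN`.

```java
import java.util.Arrays;

/** Illustrative only: a hypothetical holder, not part of Weka. */
final class SparseSketch {
    final int[] indices;     // attribute indices of the stored values, ascending
    final double[] values;   // the stored values (non-zero or missing)
    final int numAttributes; // length of the full (dense) vector

    SparseSketch(double[] dense) {
        int[] idx = new int[dense.length];
        double[] val = new double[dense.length];
        int n = 0;
        for (int i = 0; i < dense.length; i++) {
            // NaN != 0 is true, so missing values (NaN here) are stored explicitly.
            if (dense[i] != 0) {
                idx[n] = i;
                val[n] = dense[i];
                n++;
            }
        }
        indices = Arrays.copyOf(idx, n);
        values = Arrays.copyOf(val, n);
        numAttributes = dense.length;
    }
}
```

In this sketch a six-element vector with three zeros keeps only three index/value pairs, which is the saving the description refers to.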
  • Constructor Details

    • SparseInstance

      public SparseInstance(Instance instance)
      Constructor that generates a sparse instance from the given instance. The reference to the dataset is set to null (i.e., the instance doesn't have access to information about the attribute types).
      Parameters:
      instance - the instance from which the attribute values and the weight are to be copied
    • SparseInstance

      public SparseInstance(SparseInstance instance)
      Constructor that copies the info from the given instance. The reference to the dataset is set to null (i.e., the instance doesn't have access to information about the attribute types).
      Parameters:
      instance - the instance from which the attribute info is to be copied
    • SparseInstance

      public SparseInstance(double weight, double[] attValues)
      Constructor that generates a sparse instance from the given parameters. The reference to the dataset is set to null (i.e., the instance doesn't have access to information about the attribute types).
      Parameters:
      weight - the instance's weight
      attValues - a vector of attribute values
    • SparseInstance

      public SparseInstance(double weight, double[] attValues, int[] indices, int maxNumValues)
      Constructor that initializes the instance variables with the given values. The reference to the dataset is set to null (i.e., the instance doesn't have access to information about the attribute types). Note that the indices must be sorted in ascending order; otherwise the instance will not work properly.
      Parameters:
      weight - the instance's weight
      attValues - a vector of attribute values (just the ones to be stored)
      indices - the indices of the given values in the full vector (need to be sorted in ascending order)
      maxNumValues - the maximum number of values that can be stored
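
Because this constructor expects the indices in ascending order, unsorted index/value pairs need to be ordered first. The following is a minimal sketch of one way to do that; `SortedPairs` is a hypothetical helper, not part of Weka, and letting a later duplicate index overwrite an earlier one is an assumption of this sketch.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: orders parallel index/value arrays by index. */
final class SortedPairs {
    final int[] indices;
    final double[] values;

    SortedPairs(int[] rawIndices, double[] rawValues) {
        if (rawIndices.length != rawValues.length) {
            throw new IllegalArgumentException("indices and values differ in length");
        }
        // A TreeMap iterates its keys in ascending order.
        TreeMap<Integer, Double> byIndex = new TreeMap<>();
        for (int i = 0; i < rawIndices.length; i++) {
            byIndex.put(rawIndices[i], rawValues[i]); // a later duplicate wins
        }
        indices = new int[byIndex.size()];
        values = new double[byIndex.size()];
        int n = 0;
        for (Map.Entry<Integer, Double> e : byIndex.entrySet()) {
            indices[n] = e.getKey();
            values[n] = e.getValue();
            n++;
        }
    }
}
```

The resulting `values` and `indices` arrays are in the shape the `attValues` and `indices` parameters describe.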
    • SparseInstance

      public SparseInstance(int numAttributes)
      Constructor of an instance that sets the weight to one, all values to be missing, and the reference to the dataset to null (i.e., the instance doesn't have access to information about the attribute types).
      Parameters:
      numAttributes - the size of the instance
  • Method Details

    • copy

      public Object copy()
      Produces a shallow copy of this instance. The copy has access to the same dataset. (If you want to make a copy that doesn't have access to the dataset, use new SparseInstance(instance).)
      Returns:
      the shallow copy
    • copy

      public Instance copy(double[] values)
      Copies the instance but fills up its values based on the given array of doubles. The copy has access to the same dataset.
      Parameters:
      values - the array with new values
      Returns:
      the new instance
    • index

      public int index(int position)
      Returns the index of the attribute stored at the given position.
      Parameters:
      position - the position
      Returns:
      the index of the attribute stored at the given position
    • locateIndex

      public int locateIndex(int index)
      Locates the greatest index that is not greater than the given index.
      Returns:
      the internal index (position in the sparse vector) of the located attribute index, or -1 if no index with this property could be found
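
The lookup described here, "greatest stored index that is not greater than the given index", can be done with a binary search over the sorted index array. This is a sketch of that search under the assumption that the indices are held in an ascending `int[]`; `LocateSketch` is a hypothetical name and this is not necessarily how Weka implements it.

```java
/** Hypothetical illustration of the documented lookup; not Weka code. */
final class LocateSketch {
    /**
     * Returns the position of the greatest entry in sortedIndices that is
     * less than or equal to index, or -1 if every entry is greater.
     */
    static int locate(int[] sortedIndices, int index) {
        int lo = 0;
        int hi = sortedIndices.length - 1;
        int result = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedIndices[mid] <= index) {
                result = mid;  // candidate; look for a larger one to the right
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return result;
    }
}
```

With stored indices `{1, 3, 5}`, looking up 4 yields position 1 (the entry 3), and looking up 0 yields -1.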
    • mergeInstance

      public Instance mergeInstance(Instance inst)
      Merges this instance with the given instance and returns the result. Dataset is set to null.
      Parameters:
      inst - the instance to be merged with this one
      Returns:
      the merged instance
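
A merge of two sparse vectors can be sketched as follows, assuming that merging appends the attributes of the given instance after those of this one, so that the second vector's indices are shifted by the first vector's attribute count. That ordering is an assumption of this sketch, and `MergeSketch` and its nested `Sparse` holder are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical illustration of merging two sparse vectors; not Weka code. */
final class MergeSketch {
    static final class Sparse {
        final int[] indices;
        final double[] values;
        final int numAttributes;

        Sparse(int[] indices, double[] values, int numAttributes) {
            this.indices = indices;
            this.values = values;
            this.numAttributes = numAttributes;
        }
    }

    static Sparse merge(Sparse a, Sparse b) {
        int stored = a.values.length + b.values.length;
        // Start with a's entries, leaving room for b's at the end.
        int[] indices = Arrays.copyOf(a.indices, stored);
        double[] values = Arrays.copyOf(a.values, stored);
        for (int i = 0; i < b.values.length; i++) {
            // Shift b's indices past the end of a's attribute range.
            indices[a.values.length + i] = b.indices[i] + a.numAttributes;
            values[a.values.length + i] = b.values[i];
        }
        return new Sparse(indices, values, a.numAttributes + b.numAttributes);
    }
}
```

Because both inputs are sorted and every shifted index exceeds every index of the first vector, the merged index array stays in ascending order.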
    • numAttributes

      public int numAttributes()
      Returns the number of attributes.
      Returns:
      the number of attributes as an integer
    • numValues

      public int numValues()
      Returns the number of values in the sparse vector.
      Returns:
      the number of values
    • replaceMissingValues

      public void replaceMissingValues(double[] array)
      Replaces all missing values in the instance with the values contained in the given array. A deep copy of the vector of attribute values is performed before the values are replaced.
      Parameters:
      array - the array containing the means and modes
      Throws:
      IllegalArgumentException - if numbers of attributes are unequal
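
The replacement step can be sketched on plain arrays. This is an illustration under stated assumptions: missing values are modelled as `Double.NaN`, the sketch returns a new object instead of updating in place, and an entry whose replacement is zero is dropped from storage, in line with the class description. `ReplaceMissingSketch` is a hypothetical name.

```java
import java.util.Arrays;

/** Hypothetical illustration of replacing missing values; not Weka code. */
final class ReplaceMissingSketch {
    final int[] indices;
    final double[] values;

    ReplaceMissingSketch(int[] indices, double[] values) {
        this.indices = indices;
        this.values = values;
    }

    ReplaceMissingSketch replaceMissing(double[] replacements, int numAttributes) {
        if (replacements.length != numAttributes) {
            throw new IllegalArgumentException("Unequal number of attributes");
        }
        // Work on fresh arrays so the original storage is left untouched.
        int[] newIndices = new int[values.length];
        double[] newValues = new double[values.length];
        int n = 0;
        for (int i = 0; i < values.length; i++) {
            double v = Double.isNaN(values[i]) ? replacements[indices[i]] : values[i];
            if (v != 0) { // a zero replacement needs no explicit storage
                newIndices[n] = indices[i];
                newValues[n] = v;
                n++;
            }
        }
        return new ReplaceMissingSketch(Arrays.copyOf(newIndices, n), Arrays.copyOf(newValues, n));
    }
}
```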
    • setValue

      public void setValue(int attIndex, double value)
      Sets a specific value in the instance to the given value (internal floating-point format). Performs a deep copy of the vector of attribute values before the value is set.
      Parameters:
      attIndex - the attribute's index
      value - the new attribute value. If the corresponding attribute is nominal (or a string), this is the new value's index as a double.
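
Setting a value by attribute index in a sparse vector has four cases: overwrite a stored entry, remove a stored entry that becomes zero, insert a new non-zero entry, or do nothing when a zero is written to an unstored slot. The sketch below illustrates them on copies of the arrays; `SetValueSketch` and `withValue` are hypothetical names, and this is not presented as Weka's implementation.

```java
import java.util.Arrays;

/** Hypothetical illustration of setting a value by attribute index. */
final class SetValueSketch {
    final int[] indices;   // ascending attribute indices
    final double[] values; // stored values, parallel to indices

    SetValueSketch(int[] indices, double[] values) {
        this.indices = indices;
        this.values = values;
    }

    SetValueSketch withValue(int attIndex, double value) {
        int pos = Arrays.binarySearch(indices, attIndex);
        if (pos >= 0 && value != 0) {           // overwrite a stored entry
            double[] v = values.clone();
            v[pos] = value;
            return new SetValueSketch(indices.clone(), v);
        }
        if (pos >= 0) {                         // stored entry becomes zero: remove it
            int[] idx = new int[indices.length - 1];
            double[] v = new double[values.length - 1];
            System.arraycopy(indices, 0, idx, 0, pos);
            System.arraycopy(indices, pos + 1, idx, pos, idx.length - pos);
            System.arraycopy(values, 0, v, 0, pos);
            System.arraycopy(values, pos + 1, v, pos, v.length - pos);
            return new SetValueSketch(idx, v);
        }
        if (value == 0) {                       // zero written to an unstored slot
            return this;
        }
        int ins = -(pos + 1);                   // insertion point from binarySearch
        int[] idx = new int[indices.length + 1];
        double[] v = new double[values.length + 1];
        System.arraycopy(indices, 0, idx, 0, ins);
        System.arraycopy(values, 0, v, 0, ins);
        idx[ins] = attIndex;
        v[ins] = value;
        System.arraycopy(indices, ins, idx, ins + 1, indices.length - ins);
        System.arraycopy(values, ins, v, ins + 1, values.length - ins);
        return new SetValueSketch(idx, v);
    }
}
```

Copying the arrays before changing them mirrors the "deep copy of the vector of attribute values" that the method description mentions.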
    • setValueSparse

      public void setValueSparse(int indexOfIndex, double value)
      Sets the value stored at the given position in the sparse vector to the given value (internal floating-point format). Performs a deep copy of the vector of attribute values before the value is set.
      Parameters:
      indexOfIndex - the index of the attribute's index
      value - the new attribute value. If the corresponding attribute is nominal (or a string), this is the new value's index as a double.
    • toDoubleArray

      public double[] toDoubleArray()
      Returns the values of each attribute as an array of doubles. Creates a fresh array object for this.
      Returns:
      an array containing all the instance attribute values
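
Expanding a sparse vector into a full array amounts to allocating a zero-filled array and writing the stored values at their indices. A minimal sketch, with `ToDenseSketch` and its parameters as hypothetical names:

```java
/** Hypothetical illustration of expanding sparse storage; not Weka code. */
final class ToDenseSketch {
    static double[] toDense(int[] indices, double[] values, int numAttributes) {
        // Java initializes a new double[] to 0.0, the implicit default value.
        double[] dense = new double[numAttributes];
        for (int i = 0; i < values.length; i++) {
            dense[indices[i]] = values[i];
        }
        return dense;
    }
}
```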
    • toStringNoWeight

      public String toStringNoWeight()
      Returns the description of one instance in sparse format. If the instance doesn't have access to a dataset, it returns the internal floating-point values. Quotes string values that contain whitespace characters.
      Returns:
      the instance's description as a string
    • toStringNoWeight

      public String toStringNoWeight(int afterDecimalPoint)
      Returns the description of one instance in sparse format. If the instance doesn't have access to a dataset, it returns the internal floating-point values. Quotes string values that contain whitespace characters.
      Parameters:
      afterDecimalPoint - maximum number of digits permitted after the decimal point for numeric values
      Returns:
      the instance's description as a string
    • value

      public double value(int attIndex)
      Returns an instance's attribute value in internal format.
      Parameters:
      attIndex - the attribute's index
      Returns:
      the specified value as a double. If the corresponding attribute is nominal (or a string), this is the value's index as a double.
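
Reading a value by attribute index from sparse storage can be sketched as a search over the sorted indices, returning zero when the attribute is not stored. `ValueSketch` and its parameters are hypothetical names, and the use of `Arrays.binarySearch` is a choice made for this sketch.

```java
import java.util.Arrays;

/** Hypothetical illustration of a sparse value lookup; not Weka code. */
final class ValueSketch {
    static double value(int[] sortedIndices, double[] values, int attIndex) {
        int pos = Arrays.binarySearch(sortedIndices, attIndex);
        // A negative result means the index is not stored, so the value is 0.
        return pos >= 0 ? values[pos] : 0.0;
    }
}
```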
    • main

      public static void main(String[] options)
      Main method for testing this class.
    • getRevision

      public String getRevision()
      Returns the revision string.
      Specified by:
      getRevision in interface RevisionHandler
      Overrides:
      getRevision in class AbstractInstance
      Returns:
      the revision