Package weka.core.converters
Class ConverterUtils.DataSource
java.lang.Object
weka.core.converters.ConverterUtils.DataSource
- All Implemented Interfaces:
- Serializable, RevisionHandler
- Enclosing class:
- ConverterUtils
public static class ConverterUtils.DataSource
extends Object
implements Serializable, RevisionHandler
Helper class for loading data from files and URLs. Via the ConverterUtils
class it determines which converter to use for loading the data into
memory. If the chosen converter is an incremental one, then the data will
be loaded incrementally, otherwise as batch. In both cases the same
interface will be used (hasMoreElements, nextElement). Before the data can
be read again, one has to call the reset method. The data source can also
be initialized with an Instances object, in order to provide a unified
interface to files and already loaded datasets.
- Version:
- $Revision: 15656 $
- Author:
- FracPete (fracpete at waikato dot ac dot nz)
-
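The unified interface described above can be sketched with a DataSource that wraps an already loaded dataset. This is a minimal illustration, assuming Weka 3.7 or later is on the classpath; the class name DataSourceIterationSketch, its helper methods, and the toy values are hypothetical and not part of Weka.

```java
import java.util.ArrayList;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Hypothetical example class; only the weka.* types come from the library.
class DataSourceIterationSketch {

    // Builds a tiny in-memory dataset with two numeric attributes (made-up values).
    static Instances buildToyData() {
        ArrayList<Attribute> attributes = new ArrayList<>();
        attributes.add(new Attribute("x"));
        attributes.add(new Attribute("y"));
        Instances data = new Instances("toy", attributes, 3);
        data.add(new DenseInstance(1.0, new double[] {1.0, 2.0}));
        data.add(new DenseInstance(1.0, new double[] {3.0, 4.0}));
        data.add(new DenseInstance(1.0, new double[] {5.0, 6.0}));
        return data;
    }

    // Walks the source once and returns how many instances it yielded.
    static int countInstances(DataSource source) throws Exception {
        Instances structure = source.getStructure();
        int count = 0;
        while (source.hasMoreElements(structure)) {
            Instance inst = source.nextElement(structure);
            if (inst != null) {
                count++;
            }
        }
        return count;
    }

    // Reads the data, calls reset(), and reads it a second time.
    static int[] countTwoPasses() throws Exception {
        DataSource source = new DataSource(buildToyData());
        int firstPass = countInstances(source);
        source.reset(); // required before the data can be read again
        int secondPass = countInstances(source);
        return new int[] {firstPass, secondPass};
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countTwoPasses();
        System.out.println("first pass: " + counts[0] + ", second pass: " + counts[1]);
    }
}
```

Without the reset() call, the second pass would find no remaining elements, because the source keeps its read position between calls.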
Constructor Summary
Constructor - Description
DataSource(InputStream stream) - Initializes the datasource with the given input stream.
DataSource(String location) - Tries to load the data from the file.
DataSource(Loader loader) - Initializes the datasource with the given Loader.
DataSource(Instances inst) - Initializes the datasource with the given dataset.
-
Method Summary
Modifier and Type, Method - Description
Instances getDataSet() - Returns the full dataset, can be null in case of an error.
Instances getDataSet(int classIndex) - Returns the full dataset with the specified class index set, can be null in case of an error.
Loader getLoader() - Returns the determined loader, null if the DataSource was initialized with data alone and not a file/URL.
String getRevision() - Returns the revision string.
Instances getStructure() - Returns the structure of the data.
Instances getStructure(int classIndex) - Returns the structure of the data, with the defined class index.
boolean hasMoreElements(Instances structure) - Returns whether there are more Instance objects in the data.
static boolean isArff(String location) - Returns whether the extension of the location is likely to be of ARFF format, i.e., ending in ".arff" or ".arff.gz" (case-insensitive).
boolean isIncremental() - Returns whether the loader is an incremental one.
static void main(String[] args) - For testing only; takes a data file as input.
Instance nextElement(Instances dataset) - Returns the next element and sets the specified dataset, null if none available.
static Instances read(InputStream stream) - Convenience method for loading a dataset in batch mode from a stream.
static Instances read(String location) - Convenience method for loading a dataset in batch mode.
static Instances read(Loader loader) - Convenience method for loading a dataset in batch mode.
void reset() - Resets the loader.
-
Constructor Details
-
DataSource
Tries to load the data from the file. Can be either a regular file or a web location (http://, https://, ftp:// or file://).
- Parameters:
location - the name of the file to load
- Throws:
Exception - if initialization fails
-
DataSource
Initializes the datasource with the given dataset.
- Parameters:
inst - the dataset to use
-
DataSource
Initializes the datasource with the given Loader.
- Parameters:
loader - the Loader to use
-
DataSource
Initializes the datasource with the given input stream. This stream is always interpreted as ARFF.
- Parameters:
stream - the stream to use
-
-
Method Details
-
isArff
Returns whether the extension of the location is likely to be of ARFF format, i.e., ending in ".arff" or ".arff.gz" (case-insensitive).
- Parameters:
location - the file location to check
- Returns:
- true if the location seems to be of ARFF format
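The extension check described above can be approximated in plain Java. This is a sketch of the documented behaviour only, not Weka's source code; the class name ArffExtensionCheck and the method looksLikeArff are hypothetical, and returning false for a null location is an assumption of this sketch.

```java
import java.util.Locale;

// Hypothetical stand-in for the documented check; not taken from Weka.
class ArffExtensionCheck {

    // Returns true if the location ends in ".arff" or ".arff.gz", ignoring case.
    // Treating null as "not ARFF" is an assumption made for this sketch.
    static boolean looksLikeArff(String location) {
        if (location == null) {
            return false;
        }
        String lower = location.toLowerCase(Locale.ROOT);
        return lower.endsWith(".arff") || lower.endsWith(".arff.gz");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeArff("iris.ARFF"));                            // true
        System.out.println(looksLikeArff("http://example.org/data/iris.arff.gz")); // true
        System.out.println(looksLikeArff("iris.csv"));                             // false
    }
}
```

The check looks only at the name, so a file with a misleading extension would still pass; that matches the "likely to be" wording in the description.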
-
isIncremental
public boolean isIncremental()
Returns whether the loader is an incremental one.
- Returns:
- true if the loader is a true incremental one
-
getLoader
Returns the determined loader, null if the DataSource was initialized with data alone and not a file/URL.
- Returns:
- the loader used for retrieving the data
-
getDataSet
Returns the full dataset, can be null in case of an error.
- Returns:
- the full dataset
- Throws:
Exception - if resetting of loader fails
-
getDataSet
Returns the full dataset with the specified class index set, can be null in case of an error.
- Parameters:
classIndex - the class index for the dataset
- Returns:
- the full dataset
- Throws:
Exception - if resetting of loader fails
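As an illustration of loading a file and setting the class index in one call, the sketch below writes a small ARFF file to a temporary location and reads it back. It assumes Weka is on the classpath; the class name LoadWithClassIndexSketch, the helper loadToyData, and the ARFF content are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Hypothetical example class; only the weka.* types come from the library.
class LoadWithClassIndexSketch {

    // Made-up ARFF content so the example does not depend on an external file.
    static final String TOY_ARFF =
        "@relation toy\n"
        + "@attribute x numeric\n"
        + "@attribute label {yes,no}\n"
        + "@data\n"
        + "1.0,yes\n"
        + "2.0,no\n";

    static Instances loadToyData() throws Exception {
        // The ".arff" suffix matters: the converter is chosen from the file name.
        Path file = Files.createTempFile("toy", ".arff");
        file.toFile().deleteOnExit();
        Files.write(file, TOY_ARFF.getBytes(StandardCharsets.UTF_8));

        DataSource source = new DataSource(file.toString());
        // Index 1 is the second (and last) attribute of this toy file.
        Instances data = source.getDataSet(1);
        if (data == null) {
            // The documentation states the result can be null in case of an error.
            throw new IllegalStateException("Dataset could not be loaded");
        }
        return data;
    }

    public static void main(String[] args) throws Exception {
        Instances data = loadToyData();
        System.out.println("instances: " + data.numInstances());
        System.out.println("class attribute: " + data.classAttribute().name());
    }
}
```

The explicit null check reflects the note above that the returned dataset can be null when an error occurs.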
-
reset
Resets the loader.
- Throws:
Exception - if resetting fails
-
getStructure
Returns the structure of the data.
- Returns:
- the structure of the data
- Throws:
Exception - if something goes wrong
-
getStructure
Returns the structure of the data, with the defined class index.
- Parameters:
classIndex - the class index for the dataset
- Returns:
- the structure of the data
- Throws:
Exception - if something goes wrong
-
hasMoreElements
Returns whether there are more Instance objects in the data.
- Parameters:
structure - the structure of the dataset
- Returns:
- true if there are more Instance objects available
-
nextElement
Returns the next element and sets the specified dataset, null if none available.
- Parameters:
dataset - the dataset to set for the instance
- Returns:
- the next Instance
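The hasMoreElements/nextElement pair can also be used to consume a stream one instance at a time. The sketch below assumes Weka is on the classpath and that the stream-based constructor yields an incremental source, which it reports via isIncremental(); the class name IncrementalReadSketch, the helper sumFirstAttribute, and the ARFF content are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Hypothetical example class; only the weka.* types come from the library.
class IncrementalReadSketch {

    // Made-up ARFF content held in memory for the example.
    static final String TOY_ARFF =
        "@relation toy\n"
        + "@attribute x numeric\n"
        + "@attribute y numeric\n"
        + "@data\n"
        + "1.0,10.0\n"
        + "2.0,20.0\n";

    // Sums the first attribute while reading instance by instance.
    static double sumFirstAttribute() throws Exception {
        InputStream stream =
            new ByteArrayInputStream(TOY_ARFF.getBytes(StandardCharsets.UTF_8));
        DataSource source = new DataSource(stream); // streams are interpreted as ARFF
        System.out.println("incremental: " + source.isIncremental());

        // The structure (header only) is passed to both iteration methods.
        Instances structure = source.getStructure();
        double sum = 0.0;
        while (source.hasMoreElements(structure)) {
            Instance inst = source.nextElement(structure);
            sum += inst.value(0);
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sum of x: " + sumFirstAttribute());
    }
}
```

Processing each instance as it arrives keeps memory use independent of the dataset size, which is the main reason to prefer this loop over getDataSet for large inputs.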
-
read
Convenience method for loading a dataset in batch mode.
- Parameters:
location - the dataset to load
- Returns:
- the dataset
- Throws:
Exception - if loading fails
-
read
Convenience method for loading a dataset in batch mode from a stream.
- Parameters:
stream - the stream to load the dataset from
- Returns:
- the dataset
- Throws:
Exception - if loading fails
-
read
Convenience method for loading a dataset in batch mode.
- Parameters:
loader - the loader to get the dataset from
- Returns:
- the dataset
- Throws:
Exception - if loading fails
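The static read methods load a whole dataset in a single call. A minimal sketch of the stream variant follows, assuming Weka is on the classpath; the class name BatchReadSketch, the helper readToyData, and the ARFF content are invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

// Hypothetical example class; only the weka.* types come from the library.
class BatchReadSketch {

    // Made-up ARFF content held in memory for the example.
    static final String TOY_ARFF =
        "@relation toy\n"
        + "@attribute x numeric\n"
        + "@attribute y numeric\n"
        + "@data\n"
        + "1.0,10.0\n"
        + "2.0,20.0\n"
        + "3.0,30.0\n";

    // Loads the complete dataset from the stream in batch mode.
    static Instances readToyData() throws Exception {
        try (InputStream stream =
                 new ByteArrayInputStream(TOY_ARFF.getBytes(StandardCharsets.UTF_8))) {
            return DataSource.read(stream);
        }
    }

    public static void main(String[] args) throws Exception {
        Instances data = readToyData();
        System.out.println(data.numInstances() + " instances, "
            + data.numAttributes() + " attributes");
    }
}
```

The location and loader variants are called the same way, with a file name or URL string, or a configured Loader, in place of the stream.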
-
main
For testing only; takes a data file as input.
- Parameters:
args - the commandline arguments
- Throws:
Exception - if something goes wrong
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Returns:
- the revision
-