weka.core.converters
Class DatabaseLoader

java.lang.Object
  extended by weka.core.converters.AbstractLoader
      extended by weka.core.converters.DatabaseLoader
All Implemented Interfaces:
BatchConverter, DatabaseConverter, IncrementalConverter, Loader, OptionHandler, java.io.Serializable

public class DatabaseLoader
extends AbstractLoader
implements BatchConverter, IncrementalConverter, DatabaseConverter, OptionHandler

Reads from a database. Can read a database in batch or incremental mode. In incremental mode MySQL and HSQLDB are supported. For all other DBMS a pseudo-incremental mode is used: the instances are read into main memory all at once and then provided to the user incrementally.

For incremental loading the rows in the database table have to be ordered uniquely. The reason is that only a single row is fetched at a time, by extending the user's query with a LIMIT clause. If this extension is impossible, instances are loaded pseudo-incrementally. To ensure that every row is fetched exactly once, the rows have to be ordered, so a (primary) key is necessary. This approach was chosen instead of JDBC driver facilities because those differ between drivers. If you use the DatabaseSaver and save instances with an automatically generated primary key (its name is defined in DatabaseUtils), this primary key is used for ordering but is not part of the output. The user-defined SQL query to extract the instances should not contain LIMIT or ORDER BY clauses (see the -Q option).

In addition, for incremental loading, you can define in the DatabaseUtils file how many distinct values a nominal attribute is allowed to have. If this number is exceeded, the column becomes a string attribute. In batch mode no string attributes are created.

Available options are:

-Q the query to specify which tuples to load
The query must have the form: SELECT *|<column-list> FROM <table> [WHERE] (default: SELECT * FROM Results0).

-P comma-separated list of columns that are a unique key
Only needed for incremental loading, if it cannot be detected automatically

-I
Sets incremental loading
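
The single-row fetching described above can be illustrated with a small, self-contained sketch. It is not the loader's actual code: the class name `LimitQuerySketch`, the method `singleRowQuery`, and the exact `LIMIT 1 OFFSET n` clause are assumptions made for illustration, and the clause syntax differs between database systems.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka: shows how a user query could be
// extended so that each execution returns exactly one row.
final class LimitQuerySketch {

    // Appends an ORDER BY over the key columns (for a stable, unique order)
    // and a LIMIT clause that selects the row at the given zero-based offset.
    static String singleRowQuery(String userQuery, List<String> keyColumns, int offset) {
        if (keyColumns.isEmpty()) {
            throw new IllegalArgumentException("at least one key column is needed for a stable order");
        }
        if (offset < 0) {
            throw new IllegalArgumentException("offset must not be negative");
        }
        return userQuery.trim()
            + " ORDER BY " + String.join(", ", keyColumns)
            + " LIMIT 1 OFFSET " + offset;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("id"); // assumed key column name
        for (int row = 0; row < 3; row++) {
            System.out.println(singleRowQuery("SELECT * FROM Results0", keys, row));
        }
    }
}
```

Because the sketch appends its own ORDER BY and LIMIT clauses, a user query that already contained either clause would produce invalid SQL, which is consistent with the restriction on the -Q query stated above.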

Version:
$Revision: 1.1 $
Author:
Stefan Mutter (mutter@cs.waikato.ac.nz)
See Also:
Loader, Serialized Form

Field Summary
static int BOOL
           
static int BYTE
           
static int DATE
           
static int DOUBLE
           
static int FLOAT
           
static int INTEGER
           
static int LONG
           
static int SHORT
           
static int STRING
           
 
Constructor Summary
DatabaseLoader()
          Constructor
 
Method Summary
 void connectToDatabase()
          Opens a connection to the database
 Instances getDataSet()
          Returns the full data set in batch mode (header and all instances at once).
 java.lang.String getKeys()
          Gets the names of the key columns
 Instance getNextInstance()
          Reads the data set incrementally: gets the next instance in the data set, or returns null if there are no more instances to get.
 java.lang.String[] getOptions()
          Gets the current option settings
 java.lang.String getQuery()
          Gets the query to execute against the database
 Instances getStructure()
          Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances.
 java.lang.String getUrl()
          Gets the URL
 java.lang.String getUser()
          Gets the user name
 java.lang.String globalInfo()
          Returns a string describing this Loader
 java.lang.String keysTipText()
          the tip text for this property
 java.util.Enumeration listOptions()
          Lists the available options
static void main(java.lang.String[] options)
          Main method.
 java.lang.String passwordTipText()
          the tip text for this property
 java.lang.String queryTipText()
          the tip text for this property
 void reset()
          Resets the Loader ready to read a new data set
 void resetStructure()
          Resets the structure of instances
 void setKeys(java.lang.String keys)
          Sets the key columns of a database table
 void setOptions(java.lang.String[] options)
          Sets the options.
 void setPassword(java.lang.String password)
          Sets user password for the database
 void setQuery(java.lang.String q)
          Sets the query to execute against the database
 void setSource()
          Sets the database URL using the DatabaseUtils file
 void setSource(java.lang.String url)
          Sets the database URL
 void setSource(java.lang.String url, java.lang.String userName, java.lang.String password)
          Sets the database URL, user name, and password
 void setUrl(java.lang.String url)
          Sets the database URL
 void setUser(java.lang.String user)
          Sets the database user
 java.lang.String urlTipText()
          the tip text for this property
 java.lang.String userTipText()
          the tip text for this property
 
Methods inherited from class weka.core.converters.AbstractLoader
setSource, setSource
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

STRING

public static final int STRING
See Also:
Constant Field Values

BOOL

public static final int BOOL
See Also:
Constant Field Values

DOUBLE

public static final int DOUBLE
See Also:
Constant Field Values

BYTE

public static final int BYTE
See Also:
Constant Field Values

SHORT

public static final int SHORT
See Also:
Constant Field Values

INTEGER

public static final int INTEGER
See Also:
Constant Field Values

LONG

public static final int LONG
See Also:
Constant Field Values

FLOAT

public static final int FLOAT
See Also:
Constant Field Values

DATE

public static final int DATE
See Also:
Constant Field Values
Constructor Detail

DatabaseLoader

public DatabaseLoader()
               throws java.lang.Exception
Constructor

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this Loader

Returns:
a description of the Loader suitable for displaying in the Explorer/Experimenter GUI

reset

public void reset()
           throws java.lang.Exception
Resets the Loader ready to read a new data set

Throws:
java.lang.Exception - if an error occurs while disconnecting from the database

resetStructure

public void resetStructure()
Resets the structure of instances


setQuery

public void setQuery(java.lang.String q)
Sets the query to execute against the database

Parameters:
q - the query to execute

getQuery

public java.lang.String getQuery()
Gets the query to execute against the database

Returns:
the query

queryTipText

public java.lang.String queryTipText()
the tip text for this property

Returns:
the tip text

setKeys

public void setKeys(java.lang.String keys)
Sets the key columns of a database table

Parameters:
keys - a String containing the key columns in a comma separated list.
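
As a rough illustration of the comma-separated format expected here, the sketch below splits such a string into individual column names. The class `KeyListSketch` and its `parseKeys` method are hypothetical and only show one plausible way a caller might check a key list before passing it to setKeys; how the loader parses the string internally is not specified here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: splits a comma-separated key list
// into trimmed column names and ignores empty entries.
final class KeyListSketch {

    static List<String> parseKeys(String keys) {
        List<String> result = new ArrayList<>();
        if (keys == null) {
            return result; // no keys given
        }
        for (String part : keys.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Column names are made up for the example.
        System.out.println(parseKeys(" run_id, fold ,"));
    }
}
```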

getKeys

public java.lang.String getKeys()
Gets the names of the key columns

Returns:
the names of the key columns

keysTipText

public java.lang.String keysTipText()
the tip text for this property

Returns:
the tip text

setUrl

public void setUrl(java.lang.String url)
Sets the database URL

Specified by:
setUrl in interface DatabaseConverter

getUrl

public java.lang.String getUrl()
Gets the URL

Specified by:
getUrl in interface DatabaseConverter
Returns:
the URL

urlTipText

public java.lang.String urlTipText()
the tip text for this property

Returns:
the tip text

setUser

public void setUser(java.lang.String user)
Sets the database user

Specified by:
setUser in interface DatabaseConverter

getUser

public java.lang.String getUser()
Gets the user name

Specified by:
getUser in interface DatabaseConverter
Returns:
name of database user

userTipText

public java.lang.String userTipText()
the tip text for this property

Returns:
the tip text

setPassword

public void setPassword(java.lang.String password)
Sets user password for the database

Specified by:
setPassword in interface DatabaseConverter

passwordTipText

public java.lang.String passwordTipText()
the tip text for this property

Returns:
the tip text

setSource

public void setSource(java.lang.String url,
                      java.lang.String userName,
                      java.lang.String password)
Sets the database URL, user name, and password

Parameters:
url - the database url
userName - the user name
password - the password

setSource

public void setSource(java.lang.String url)
Sets the database url

Parameters:
url - the database url

setSource

public void setSource()
               throws java.lang.Exception
Sets the database url using the DatabaseUtils file

Throws:
java.lang.Exception

connectToDatabase

public void connectToDatabase()
Opens a connection to the database


getStructure

public Instances getStructure()
                       throws java.io.IOException
Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances.

Specified by:
getStructure in interface Loader
Specified by:
getStructure in class AbstractLoader
Returns:
the structure of the data set as an empty set of Instances
Throws:
java.io.IOException - if an error occurs

getDataSet

public Instances getDataSet()
                     throws java.io.IOException
Returns the full data set in batch mode (header and all instances at once).

Specified by:
getDataSet in interface Loader
Specified by:
getDataSet in class AbstractLoader
Returns:
the full data set (header and all instances) as an Instances object
Throws:
java.io.IOException - if there is no source or parsing fails

getNextInstance

public Instance getNextInstance()
                         throws java.io.IOException
Reads the data set incrementally: gets the next instance in the data set, or returns null if there are no more instances to get. If the structure has not yet been determined by a call to getStructure, this method determines it before returning the next instance in the data set.

Specified by:
getNextInstance in interface Loader
Specified by:
getNextInstance in class AbstractLoader
Returns:
the next instance in the data set as an Instance object or null if there are no more instances to be read
Throws:
java.io.IOException - if there is an error during parsing
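
The contract above (return the next element, return null at the end, possibly throw java.io.IOException) suggests a simple read loop. The sketch below demonstrates that loop against a hypothetical stand-in interface, `RowSource`, so that it runs without Weka or a database; with the real loader the call inside the loop would be getNextInstance(). All names in the sketch are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class IncrementalReadSketch {

    // Hypothetical stand-in for a pull-style reader: returns null when exhausted.
    interface RowSource<T> {
        T next() throws IOException;
    }

    // Reads until the source returns null and collects everything it produced.
    static <T> List<T> drain(RowSource<T> source) {
        List<T> rows = new ArrayList<>();
        try {
            T row;
            while ((row = source.next()) != null) {
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException("reading failed after " + rows.size() + " rows", e);
        }
        return rows;
    }

    // Builds an in-memory source so the loop can be tried without a database.
    static <T> RowSource<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        System.out.println(drain(fromList(Arrays.asList("row-1", "row-2", "row-3"))));
    }
}
```

Collecting all rows into a list is only for demonstration; in practice an incremental consumer would usually process each row inside the loop and discard it, which is the point of reading incrementally.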

getOptions

public java.lang.String[] getOptions()
Gets the current option settings

Specified by:
getOptions in interface OptionHandler
Returns:
the current option settings

listOptions

public java.util.Enumeration listOptions()
Lists the available options

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of the available options

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Sets the options. Available options are:

-Q the query to specify which tuples to load
The query must have the form: SELECT *|<column-list> FROM <table> [WHERE] (default: SELECT * FROM Results0).

-P comma-separated list of columns that are a unique key
Only needed for incremental loading, if it cannot be detected automatically

-I
Sets incremental loading

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the options
Throws:
java.lang.Exception - if options cannot be set
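
For illustration, the following sketch assembles an options array using the three flags listed above (-Q, -P, -I). The helper class `LoaderOptionsSketch` and its `buildOptions` method are hypothetical; only the flag names come from this page. The resulting array has the shape that setOptions accepts as its `options` parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Weka: builds a String[] in the
// flag/value layout described for -Q, -P and -I.
final class LoaderOptionsSketch {

    static String[] buildOptions(String query, String keys, boolean incremental) {
        List<String> opts = new ArrayList<>();
        if (query != null && !query.isEmpty()) {
            opts.add("-Q");
            opts.add(query);
        }
        if (keys != null && !keys.isEmpty()) {
            opts.add("-P");
            opts.add(keys);
        }
        if (incremental) {
            opts.add("-I"); // flag without a value
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // "id" is an assumed key column name.
        System.out.println(Arrays.toString(buildOptions("SELECT * FROM Results0", "id", true)));
    }
}
```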

main

public static void main(java.lang.String[] options)
Main method.

Parameters:
options - the options