Types and Functions

Index

TSML.DecisionTreeLearners.Adaboost
TSML.DecisionTreeLearners.PrunedTree
TSML.DecisionTreeLearners.RandomForest
TSML.Outliernicers.Outliernicer
TSML.Plotters.Plotter
TSML.BaselineAlgos.Baseline
TSML.BaselineAlgos.Identity
TSML.EnsembleMethods.BestLearner
TSML.EnsembleMethods.StackEnsemble
TSML.EnsembleMethods.VoteEnsemble
TSML.MLBaseWrapper.StandardScaler
TSML.MLBaseWrapper.Standardize
TSML.Monotonicers.Monotonicer
TSML.Statifiers.Statifier
TSML.BaseFilters.Imputer
TSML.BaseFilters.OneHotEncoder
TSML.BaseFilters.Wrapper
TSML.BaseFilters.createtransformer
TSML.BaseFilters.find_nominal_columns
TSML.ValDateFilters.BzCSVDateValReader
TSML.ValDateFilters.CSVDateValReader
TSML.ValDateFilters.CSVDateValWriter
TSML.ValDateFilters.DateValLinearImputer
TSML.ValDateFilters.DateValMultiNNer
TSML.ValDateFilters.DateValNNer
TSML.ValDateFilters.DateValgator
TSML.ValDateFilters.DateValizer
TSML.ValDateFilters.Dateifier
TSML.ValDateFilters.Matrifier
TSML.TSClassifiers.TSClassifier
TSML.TSMLTypes.fit!
TSML.TSMLTypes.transform!

Descriptions

TSML.DecisionTreeLearners.Adaboost — Type

Adaboost(
  Dict(
    :output => :class,
    :num_iterations => 7
  )
)

Adaboosted decision tree stumps. See DecisionTree.jl's documentation.

Hyperparameters:

:num_iterations => 7 (number of iterations of AdaBoost)

Implements fit!, transform!

source

TSML.DecisionTreeLearners.PrunedTree — Type

PrunedTree(
  Dict(
    :purity_threshold => 1.0,
    :max_depth => -1,
    :min_samples_leaf => 1,
    :min_samples_split => 2,
    :min_purity_increase => 0.0
  )
)

Decision tree classifier. See DecisionTree.jl's documentation.

Hyperparameters:

:purity_threshold => 1.0 (merge leaves having >=thresh combined purity)
:max_depth => -1 (maximum depth of the decision tree)
:min_samples_leaf => 1 (the minimum number of samples each leaf needs to have)
:min_samples_split => 2 (the minimum number of samples needed for a split)
:min_purity_increase => 0.0 (minimum purity needed for a split)

Implements fit!, transform!

source

TSML.DecisionTreeLearners.RandomForest — Type

RandomForest(
  Dict(
    :output => :class,
    :num_subfeatures => 0,
    :num_trees => 10,
    :partial_sampling => 0.7,
    :max_depth => -1
  )
)

Random forest classification. See DecisionTree.jl's documentation.

Hyperparameters:

:num_subfeatures => 0 (number of features to consider at random per split)
:num_trees => 10 (number of trees to train)
:partial_sampling => 0.7 (fraction of samples to train each tree on)
:max_depth => -1 (maximum depth of the decision trees)
:min_samples_leaf => 1 (the minimum number of samples each leaf needs to have)
:min_samples_split => 2 (the minimum number of samples needed for a split)
:min_purity_increase => 0.0 (minimum purity needed for a split)

Implements fit!, transform!

source

TSML.TSMLTypes.fit! — Method

fit!(adaboost::Adaboost, features::T, labels::Vector) where {T<:Union{Vector,Matrix,DataFrame}}

Optimize the hyperparameters of the Adaboost instance.

source

TSML.TSMLTypes.fit! — Method

fit!(tree::PrunedTree, features::T, labels::Vector) where {T<:Union{Vector,Matrix,DataFrame}}

Optimize the hyperparameters of the PrunedTree instance.

source

TSML.TSMLTypes.fit! — Method

fit!(forest::RandomForest, features::T, labels::Vector) where {T<:Union{Vector,Matrix,DataFrame}}

Optimize the parameters of the RandomForest instance.

source

TSML.TSMLTypes.transform! — Method

transform!(adaboost::Adaboost, features::T) where {T<:Union{Vector,Matrix,DataFrame}}

Predict using the optimized hyperparameters of the trained Adaboost instance.

source

TSML.TSMLTypes.transform! — Method

transform!(tree::PrunedTree, features::T) where {T<:Union{Vector,Matrix,DataFrame}}

Predict using the optimized hyperparameters of the trained PrunedTree instance.

source

TSML.TSMLTypes.transform! — Method

transform!(forest::RandomForest, features::T) where {T<:Union{Vector,Matrix,DataFrame}}

Predict using the optimized hyperparameters of the trained RandomForest instance.

source

TSML.Outliernicers.Outliernicer — Type

Outliernicer(Dict(
   :dateinterval => Dates.Hour(1),
   :nnsize => 1,
   :missdirection => :symmetric
))

Detects outliers that fall outside the interval (q25-iqr, q75+iqr) and calls DateValNNer to replace them with their nearest neighbors.

Example:

fname = joinpath(dirname(pathof(TSML)),"../data/testdata.csv")
csvfilter = CSVDateValReader(Dict(:filename=>fname,:dateformat=>"dd/mm/yyyy HH:MM"))
valgator = DateValgator(Dict(:dateinterval=>Dates.Hour(1)))
valnner = DateValNNer(Dict(:dateinterval=>Dates.Hour(1)))
stfier = Statifier(Dict(:processmissing=>true))
mono = Monotonicer(Dict())
outliernicer = Outliernicer(Dict(:dateinterval=>Dates.Hour(1)))

mpipeline = Pipeline(Dict(
     :transformers => [csvfilter,valgator,mono,valnner,outliernicer,stfier]
   )
)
fit!(mpipeline)
results = transform!(mpipeline)

Implements: fit!, transform!

source

TSML.TSMLTypes.fit! — Function

fit!(st::Outliernicer, features::T, labels::Vector=[]) where {T<:Union{Vector,Matrix,DataFrame}}

Check that features are two-column data.

source

TSML.TSMLTypes.transform! — Method

transform!(st::Outliernicer, features::T) where {T<:Union{Vector,Matrix,DataFrame}}

Locates outliers based on the IQR factor and calls DateValNNer to replace them with their nearest neighbors.
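
The IQR rule described here can be sketched in plain Julia. This is an illustration under stated assumptions, not TSML's implementation: the helper name iqr_outlier_mask and its factor keyword are hypothetical, and factor = 1.0 is chosen to match the (q25-iqr, q75+iqr) bounds given for Outliernicer.

```julia
using Statistics

# Hypothetical helper (not part of TSML): flag values that fall outside
# (q25 - factor*iqr, q75 + factor*iqr).
function iqr_outlier_mask(vals::AbstractVector{<:Real}; factor::Real=1.0)
    q25, q75 = quantile(vals, [0.25, 0.75])
    iqr = q75 - q25
    lower = q25 - factor * iqr
    upper = q75 + factor * iqr
    return [v < lower || v > upper for v in vals]
end

vals = [10.0, 11.0, 12.0, 11.5, 10.5, 100.0]
mask = iqr_outlier_mask(vals)   # only the last value is flagged
```

In a full filter, the flagged positions would then be set to missing and handed to a nearest-neighbor imputer.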

source

TSML.Plotters.Plotter — Type

Plotter( Dict( :interactive => false, :pdfoutput => false ) )

Plots a time series as a static plot by default, and performs interactive plotting if requested at instance creation.

:interactive => boolean to indicate whether to use interactive plotting, with false as default
:pdfoutput => boolean to indicate whether output will be saved as pdf, with false as default

Example:

fname = joinpath(dirname(pathof(TSML)),"../data/testdata.csv")
csvfilter = CSVDateValReader(Dict(:filename=>fname,:dateformat=>"dd/mm/yyyy HH:MM"))
pltr = Plotter(Dict(:interactive => false))

mpipeline = Pipeline(Dict(
     :transformers => [csvfilter,pltr]
   )
)
fit!(mpipeline)
myplot = transform!(mpipeline)

Implements: fit!, transform!

source

TSML.TSMLTypes.fit! — Function

fit!(pltr::Plotter, features::T, labels::Vector=[]) where {T<:Union{Vector,Matrix,DataFrame}}

Check validity of features: 2-column Date,Val data

source

TSML.TSMLTypes.transform! — Method

transform!(pltr::Plotter, features::T) where {T<:Union{Vector,Matrix,DataFrame}}

Convert missing into NaN to allow plotting of discontinuities.

source

TSML.BaselineAlgos.Baseline — Type

Baseline(
   default_args = Dict(
      :output => :class,
      :strat => mode
   )
)

Baseline model that returns the mode during classification.

source

TSML.BaselineAlgos.Identity — Type

Identity(args=Dict())

Returns the input as output.

source

TSML.TSMLTypes.fit! — Function

fit!(bsl::Baseline,x::Matrix,y::Vector)

Get the mode of the training data.

source

TSML.TSMLTypes.fit! — Function

fit!(idy::Identity,x::Matrix,y::Vector)

Does nothing.

source

TSML.TSMLTypes.transform! — Method

transform!(bsl::Baseline,x::Matrix)

Return the mode in classification.

source

TSML.TSMLTypes.transform! — Method

transform!(idy::Identity,x::Matrix)

Return the input as output.

source

TSML.EnsembleMethods.BestLearner — Type

BestLearner(
   Dict(
      # Output to train against
      # (:class).
      :output => :class,
      # Function to return partitions of instance indices.
      :partition_generator => (instances, labels) -> kfold(size(instances, 1), 5),
      # Function that selects the best learner by index.
      # Arg learner_partition_scores is a (learner, partition) score matrix.
      :selection_function => (learner_partition_scores) -> findmax(mean(learner_partition_scores, dims=2))[2],      
      # Score type returned by score() using respective output.
      :score_type => Real,
      # Candidate learners.
      :learners => [PrunedTree(), Adaboost(), RandomForest()],
      # Options grid for learners, to search through by BestLearner.
      # Format is [learner_1_options, learner_2_options, ...]
      # where learner_options is same as a learner's options but
      # with a list of values instead of scalar.
      :learner_options_grid => nothing
   )
)

Selects best learner from the set by performing a grid search on learners if grid option is indicated.
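
The default :selection_function shown above can be tried on a small score matrix. The scores below are made-up numbers for illustration only. Note that findmax over a (learners x 1) matrix returns a CartesianIndex, so the learner's row is its first component.

```julia
using Statistics

# Rows are learners, columns are partitions (illustrative scores only).
scores = [0.80 0.82 0.78;
          0.90 0.88 0.91;
          0.60 0.65 0.70]

# Same expression as the default :selection_function.
best = findmax(mean(scores, dims=2))[2]

# `best` indexes into the (learners x 1) matrix of mean scores;
# its first component is the row of the best learner.
best_learner = best[1]
```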

source

TSML.EnsembleMethods.StackEnsemble — Type

StackEnsemble(
   Dict(    
      # Output to train against
      # (:class).
      :output => :class,
      # Set of learners that produce feature space for stacker.
      :learners => [PrunedTree(), Adaboost(), RandomForest()],
      # Machine learner that trains on set of learners' outputs.
      :stacker => RandomForest(),
      # Proportion of training set left to train stacker itself.
      :stacker_training_proportion => 0.3,
      # Provide original features on top of learner outputs to stacker.
      :keep_original_features => false
   )
)

An ensemble where a 'stack' of learners is used for training and prediction.

source

TSML.EnsembleMethods.VoteEnsemble — Type

VoteEnsemble(
   Dict( 
      # Output to train against
      # (:class).
      :output => :class,
      # Learners in voting committee.
      :learners => [PrunedTree(), Adaboost(), RandomForest()]
   )
)

Set of machine learners employing majority vote to decide prediction.

Implements: fit!, transform!

source

TSML.TSMLTypes.fit! — Method

fit!(bls::BestLearner, instances::T, labels::Vector) where {T<:Union{Matrix,DataFrame}}

Training phase:

obtain learners as is if grid option is not present
generate learners if grid option is present
for each prototype learner, generate learners with specific options found in grid
generate partitions
train each learner on each partition and obtain validation output

source

TSML.TSMLTypes.fit! — Method

fit!(se::StackEnsemble, instances::T, labels::Vector) where {T<:Union{Vector,Matrix,DataFrame}}

Training phase of the stack of learners.

perform holdout to obtain indices for partitioning learner and stacker training sets
partition training set for learners and stacker
train all learners
train stacker on learners' outputs
build final model from the trained learners

source

TSML.TSMLTypes.fit! — Method

fit!(ve::VoteEnsemble, instances::T, labels::Vector) where {T<:Union{Vector,Matrix,DataFrame}}

Training phase of the ensemble.

source

TSML.TSMLTypes.transform! — Method

transform!(bls::BestLearner, instances::T) where {T<:Union{Vector,Matrix,DataFrame}}

Choose the best learner based on cross-validation results and use it for prediction.

source

TSML.TSMLTypes.transform! — Method

transform!(se::StackEnsemble, instances::T) where {T<:Union{Vector,Matrix,DataFrame}}

Build stacker instances and predict

source

TSML.TSMLTypes.transform! — Method

transform!(ve::VoteEnsemble, instances::T) where {T<:Union{Vector,Matrix,DataFrame}}

Prediction phase of the ensemble.
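
Majority voting over the learners' predictions can be sketched as follows. This is a plain-Julia illustration, not TSML's implementation; the helper name majority_vote is hypothetical, and tie-breaking is left arbitrary.

```julia
# Hypothetical helper (not part of TSML): given one prediction vector per
# learner, return the most frequent label for each instance.
function majority_vote(predictions::AbstractVector{<:AbstractVector})
    n = length(first(predictions))
    return map(1:n) do i
        counts = Dict{Any,Int}()
        for p in predictions
            counts[p[i]] = get(counts, p[i], 0) + 1
        end
        # Ties are broken arbitrarily by Dict iteration order in this sketch.
        findmax(counts)[2]
    end
end

votes = majority_vote([["a", "b", "a"],
                       ["a", "b", "b"],
                       ["b", "b", "a"]])
```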

source

TSML.MLBaseWrapper.StandardScaler — Type

StandardScaler(
   Dict( 
      :center => true,
      :scale => true
   )
)

Standardizes each feature using (X - mean) / stddev. Will produce NaN if standard deviation is zero.

source

TSML.MLBaseWrapper.Standardize — Type

Standardize(d::Int, m::Vector{Float64}, s::Vector{Float64})

Standardization type.

source

TSML.TSMLTypes.fit! — Function

fit!(st::StandardScaler, features::T, labels::Vector=[]) where {T<:Union{Vector,Matrix,DataFrame}}

Compute the parameters to center and scale.

source

TSML.TSMLTypes.transform! — Method

transform!(st::StandardScaler, features::T)  where {T<:Union{Vector,Matrix,DataFrame}}

Apply the computed parameters for centering and scaling to new data.
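
The two phases can be illustrated with a minimal sketch in plain Julia. The helper names fit_scaler and apply_scaler are hypothetical and are not TSML's API. The example also shows the NaN that results from a zero standard deviation, as noted for StandardScaler.

```julia
using Statistics

# Hypothetical helpers (not part of TSML): compute per-column mean and
# standard deviation, then apply (X - mean) / stddev column-wise.
fit_scaler(X::AbstractMatrix{<:Real}) = (mean(X, dims=1), std(X, dims=1))
apply_scaler(X::AbstractMatrix{<:Real}, mu, sigma) = (X .- mu) ./ sigma

Xtrain = [1.0 5.0;
          2.0 5.0;
          3.0 5.0]
mu, sigma = fit_scaler(Xtrain)
Z = apply_scaler(Xtrain, mu, sigma)
# Column 2 is constant, so its stddev is zero and 0/0 yields NaN.
```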

source

TSML.Monotonicers.Monotonicer — Type

Monotonicer()

Monotonic filter to detect and normalize two types of dataset:

daily monotonic
entirely non-decreasing/non-increasing data

Example:

fname = joinpath(dirname(pathof(TSML)),"../data/testdata.csv")
csvfilter = CSVDateValReader(Dict(:filename=>fname,:dateformat=>"dd/mm/yyyy HH:MM"))
valgator = DateValgator(Dict(:dateinterval=>Dates.Hour(1)))
valnner = DateValNNer(Dict(:dateinterval=>Dates.Hour(1)))
stfier = Statifier(Dict(:processmissing=>true))
mono = Monotonicer(Dict())

mypipeline = Pipeline(Dict(
    :transformers => [csvfilter,valgator,mono,stfier]
   )
)
fit!(mypipeline)
result = transform!(mypipeline)

Implements: fit!, transform!

source

TSML.TSMLTypes.fit! — Function

fit!(st::Monotonicer,features::T, labels::Vector=[]) where {T<:Union{Vector,Matrix,DataFrame}}

A function that checks if features are two-column data of Dates and Values

source

TSML.TSMLTypes.transform! — Method

transform!(st::Monotonicer, features::T) where {T<:Union{Vector,Matrix,DataFrame}}

Normalize monotonic or daily monotonic data by taking the diffs and counting the flips.

source

TSML.Statifiers.Statifier — Type

Statifier(Dict(
   :processmissing => true
))

Outputs summary statistics such as mean, median, quartile, entropy, kurtosis, skewness, etc. with parameter:

:processmissing => boolean to indicate whether to include missing data stats.

Example:

dt=[missing;rand(1:10,3);missing;missing;missing;rand(1:5,3)]
dat = DataFrame(Date= DateTime(2017,12,31,1):Dates.Hour(1):DateTime(2017,12,31,10) |> collect,
                Value = dt)

statfier = Statifier(Dict(:processmissing=>false))

fit!(statfier,dat)
results=transform!(statfier,dat)

Implements: fit!, transform!

source

TSML.TSMLTypes.fit! — Function

fit!(st::Statifier, features::T=[], labels::Vector=[]) where {T<:Union{Vector,Matrix,DataFrame}}

Validate argument to make sure it's a 2-column format.

source

TSML.TSMLTypes.transform! — Function

transform!(st::Statifier, features::T=[]) where {T<:Union{Vector,Matrix,DataFrame}}

Compute statistics.

source

TSML.BaseFilters.Imputer — Type

Imputer(
   Dict(
      # Imputation strategy.
      # Statistic that takes a vector such as mean or median.
      :strategy => mean
   )
)

Imputes NaN values from Float64 features.
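
A minimal sketch of NaN imputation with a statistic such as mean. It assumes the statistic is computed per column over the non-NaN entries, which the description above does not state; the helper name impute_nan is hypothetical and not part of TSML.

```julia
using Statistics

# Hypothetical helper (not part of TSML): replace NaN entries in each column
# with `strategy` applied to that column's non-NaN values.
function impute_nan(X::AbstractMatrix{Float64}; strategy=mean)
    out = copy(X)                       # leave the input untouched
    for j in axes(out, 2)
        col = view(out, :, j)
        known = filter(!isnan, col)
        isempty(known) && continue      # an all-NaN column stays as-is
        col[isnan.(col)] .= strategy(known)
    end
    return out
end

X = [1.0 NaN;
     NaN 4.0;
     3.0 6.0]
imputed = impute_nan(X)
```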

source

TSML.BaseFilters.OneHotEncoder — Type

OneHotEncoder(Dict(
   # Nominal columns
   :nominal_columns => nothing,

   # Nominal column values map. Key is column index, value is list of
   # possible values for that column.
   :nominal_column_values_map => nothing
))

Transforms instances with nominal features into one-hot form and coerces the instance matrix to be of element type Float64.
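
One-hot encoding of a single nominal column can be sketched as below. This is an illustration only: the helper name onehot and the explicit levels argument are assumptions, loosely mirroring the role of :nominal_column_values_map.

```julia
# Hypothetical helper (not part of TSML): encode one nominal column as a
# Float64 matrix with one indicator column per level.
function onehot(col::AbstractVector, levels::AbstractVector)
    return Float64[v == l for v in col, l in levels]
end

encoded = onehot(["red", "blue", "red"], ["blue", "red"])
# Each row has a single 1.0, in the column of that row's level.
```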

source

TSML.BaseFilters.Wrapper — Type

Wrapper(
   default_args = Dict(
      # Transformer to call.
      :transformer => OneHotEncoder(),
      # Transformer args.
      :transformer_args => nothing
   )
)

Wraps around a TSML transformer.

source

TSML.BaseFilters.createtransformer — Function

createtransformer(prototype::Transformer, args=nothing)

Create transformer

prototype: prototype transformer to base new transformer on
args: additional options to override prototype's options

Returns: new transformer.

source

TSML.BaseFilters.find_nominal_columns — Method

find_nominal_columns(features::T) where {T<:Union{Vector,Matrix,DataFrame}}

Finds all nominal columns.

A column is nominal if its element type is not Real and not all of its elements are Real values.
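
The nominal-column test can be sketched on a small mixed matrix. This is a plain-Julia illustration with a hypothetical helper name, not the TSML function itself.

```julia
# Hypothetical helper (not part of TSML): a column counts as nominal when at
# least one of its elements is not a Real value.
function nominal_columns(X::AbstractMatrix)
    return [j for j in axes(X, 2) if !all(x -> x isa Real, view(X, :, j))]
end

X = Any[1.0 "red"  3;
        2.5 "blue" 4]
cols = nominal_columns(X)   # only the string column qualifies
```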

source

TSML.ValDateFilters.BzCSVDateValReader — Type

BzCSVDateValReader(
   Dict(
      :filename => "",
      :dateformat => ""
   )
)

Reads a bzip2-compressed CSV file and parses dates using the given format.

:filename => complete path including filename of csv file
:dateformat => date format to parse

Example:

inputfile =joinpath(dirname(pathof(TSML)),"../data/testdata.csv.bz2")
csvreader = BzCSVDateValReader(Dict(:filename=>inputfile,:dateformat=>"d/m/y H:M"))
filter1 = DateValgator()
filter2 = DateValNNer(Dict(:nnsize=>1))
mypipeline = Pipeline(Dict(
      :transformers => [csvreader,filter1,filter2]
  )
)
fit!(mypipeline)
res=transform!(mypipeline)

Implements: fit!, transform!

source

TSML.ValDateFilters.CSVDateValReader — Type

CSVDateValReader(
   Dict(
      :filename => "",
      :dateformat => ""
   )
)

Reads a CSV file and parses dates using the given format.

:filename => complete path including filename of csv file
:dateformat => date format to parse

Example:

inputfile =joinpath(dirname(pathof(TSML)),"../data/testdata.csv")
csvreader = CSVDateValReader(Dict(:filename=>inputfile,:dateformat=>"d/m/y H:M"))
fit!(csvreader)
df = transform!(csvreader)

# using pipeline workflow
filter1 = DateValgator()
filter2 = DateValNNer(Dict(:nnsize=>1))
mypipeline = Pipeline(Dict(
      :transformers => [csvreader,filter1,filter2]
  )
)
fit!(mypipeline)
res=transform!(mypipeline)

Implements: fit!, transform!

source

TSML.ValDateFilters.CSVDateValWriter — Type

CSVDateValWriter(
   Dict(
      :filename => "",
      :dateformat => ""
   )
)

Writes the time series dataframe into a file with the given date format.

Example:

inputfile =joinpath(dirname(pathof(TSML)),"../data/testdata.csv")
outputfile = joinpath("/tmp/test.csv")
csvreader = CSVDateValReader(Dict(:filename=>inputfile,:dateformat=>"d/m/y H:M"))
csvwtr = CSVDateValWriter(Dict(:filename=>outputfile,:dateformat=>"d/m/y H:M"))
filter1 = DateValgator()
filter2 = DateValNNer(Dict(:nnsize=>1))
mypipeline = Pipeline(Dict(
      :transformers => [csvreader,filter1,filter2,csvwtr]
  )
)
fit!(mypipeline)
res=transform!(mypipeline)

# read back what was written to validate
csvreader = CSVDateValReader(Dict(:filename=>outputfile,:dateformat=>"y-m-d HH:MM:SS"))
fit!(csvreader)
transform!(csvreader)

Implements: fit!, transform!

source

TSML.ValDateFilters.DateValLinearImputer — Type

DateValLinearImputer(
   Dict(
      :dateinterval => Dates.Hour(1),
  )
)

Fills missings by linear interpolation.

:dateinterval => time period to use for grouping

Example:

Random.seed!(123)
gdate = DateTime(2014,1,1):Dates.Minute(15):DateTime(2016,1,1)
gval = Array{Union{Missing,Float64}}(rand(length(gdate)))
gmissing = 50000
gndxmissing = Random.shuffle(1:length(gdate))[1:gmissing]
X = DataFrame(Date=gdate,Value=gval)
X.Value[gndxmissing] .= missing

dnnr = DateValLinearImputer()
fit!(dnnr,X)
transform!(dnnr,X)

Implements: fit!, transform!

source

TSML.ValDateFilters.DateValMultiNNer — Type

DateValMultiNNer(
   Dict(
      :type => :knn, # or :linear
      :missdirection => :symmetric, # or :reverse or :forward
      :dateinterval => Dates.Hour(1),
      :nnsize => 1,
      :strict => true,
      :aggregator => :median
  )
)

Fills missings with their nearest neighbors. It assumes that the first column is of a date type and the other columns are Union{Missing,Real}. It uses DateValNNer and DateValizer+Impute to process each numeric column concatenated with the Date column.

:type => type of imputation which can be a linear interpolation or nearest neighbor
:missdirection => direction to fill missing data (:symmetric, :reverse, :forward)
:dateinterval => time period to use for grouping
:nnsize => neighborhood size
:strict => boolean value to indicate whether to be strict about replacement or not
:aggregator => function to aggregate based on date interval

Example:

Random.seed!(123)
gdate = DateTime(2014,1,1):Dates.Minute(15):DateTime(2016,1,1)
gval1 = Array{Union{Missing,Float64}}(rand(length(gdate)))
gval2 = Array{Union{Missing,Float64}}(rand(length(gdate)))
gval3 = Array{Union{Missing,Float64}}(rand(length(gdate)))
gmissing = 50000
gndxmissing1 = Random.shuffle(1:length(gdate))[1:gmissing]
gndxmissing2 = Random.shuffle(1:length(gdate))[1:gmissing]
gndxmissing3 = Random.shuffle(1:length(gdate))[1:gmissing]
X = DataFrame(Date=gdate,Temperature=gval1,Humidity=gval2,Ozone=gval3)
X.Temperature[gndxmissing1] .= missing
X.Humidity[gndxmissing2] .= missing
X.Ozone[gndxmissing3] .= missing

dnnr = DateValMultiNNer(Dict(
      :type=>:linear,
      :dateinterval=>Dates.Hour(1),
      :nnsize=>10,
      :missdirection => :symmetric,
      :strict=>true,
      :aggregator => :mean))
fit!(dnnr,X)
transform!(dnnr,X)

Implements: fit!, transform!

source

TSML.ValDateFilters.DateValNNer — Type

DateValNNer(
   Dict(
      :missdirection => :symmetric, # or :reverse or :forward
      :dateinterval => Dates.Hour(1),
      :nnsize => 1,
      :strict => true,
      :aggregator => :median
  )
)

Fills missings with their nearest-neighbors.

:missdirection => direction to fill missing data (:symmetric, :reverse, :forward)
:dateinterval => time period to use for grouping
:nnsize => neighborhood size
:strict => boolean value to indicate whether to be strict about replacement or not
:aggregator => function to aggregate based on date interval

Example:

Random.seed!(123)
gdate = DateTime(2014,1,1):Dates.Minute(15):DateTime(2016,1,1)
gval = Array{Union{Missing,Float64}}(rand(length(gdate)))
gmissing = 50000
gndxmissing = Random.shuffle(1:length(gdate))[1:gmissing]
X = DataFrame(Date=gdate,Value=gval)
X.Value[gndxmissing] .= missing

dnnr = DateValNNer(Dict(
      :dateinterval=>Dates.Hour(1),
      :nnsize=>10,
      :missdirection => :symmetric,
      :strict=>true,
      :aggregator => :mean))
fit!(dnnr,X)
transform!(dnnr,X)

Implements: fit!, transform!

source

TSML.ValDateFilters.DateValgator — Type

DateValgator(
   Dict(
    :dateinterval => Dates.Hour(1),
    :aggregator => :median
  )
)

Aggregates values based on date period specified.

Example:

# generate random values with missing data
Random.seed!(123)
gdate = DateTime(2014,1,1):Dates.Minute(15):DateTime(2016,1,1)
gval = Array{Union{Missing,Float64}}(rand(length(gdate)))
gmissing = 50000
gndxmissing = Random.shuffle(1:length(gdate))[1:gmissing]
X = DataFrame(Date=gdate,Value=gval)
X.Value[gndxmissing] .= missing

dtvlmean = DateValgator(Dict(
      :dateinterval=>Dates.Hour(1),
      :aggregator => :mean))
fit!(dtvlmean,X)
res = transform!(dtvlmean,X)

Implements: fit!, transform!

source

TSML.ValDateFilters.DateValizer — Type

DateValizer(
   Dict(
    :medians => DataFrame(),
    :dateinterval => Dates.Hour(1)
  )
)

Normalizes and cleans time series by replacing missings with global medians computed based on time period groupings.

Example:

# generate random values with missing data
Random.seed!(123)
gdate = DateTime(2014,1,1):Dates.Minute(15):DateTime(2016,1,1)
gval = Array{Union{Missing,Float64}}(rand(length(gdate)))
gmissing = 50000
gndxmissing = Random.shuffle(1:length(gdate))[1:gmissing]
X = DataFrame(Date=gdate,Value=gval)
X.Value[gndxmissing] .= missing

dvzr = DateValizer(Dict(:dateinterval=>Dates.Hour(1)))
fit!(dvzr,X)
transform!(dvzr,X)

Implements: fit!, transform!

source

TSML.ValDateFilters.Dateifier — Type

Dateifier(
   Dict(
    :ahead => 1,
    :size => 7,
    :stride => 1
   )
)

Converts a 1-D date series into sliding window matrix for ML training.

Example:

dtr = Dateifier(Dict())
lower = DateTime(2017,1,1)
upper = DateTime(2018,1,31)
dat=lower:Dates.Day(1):upper |> collect
vals = rand(length(dat))
x=DataFrame(Date=dat,Value=vals)
fit!(dtr,x)
res = transform!(dtr,x)

Implements: fit!, transform!

source

TSML.ValDateFilters.Matrifier — Type

Matrifier(
   Dict(
    :ahead => 1,
    :size => 7,
    :stride => 1
  )
)

Converts a 1-D timeseries into sliding window matrix for ML training:

:ahead => steps ahead to predict
:size => size of sliding window
:stride => amount of overlap in sliding window

Example:

mtr = Matrifier(Dict(:ahead=>24,:size=>24,:stride=>5))
lower = DateTime(2017,1,1)
upper = DateTime(2017,1,5)
dat=lower:Dates.Hour(1):upper |> collect
vals = 1:length(dat)
x = DataFrame(Date=dat,Value=vals)
fit!(mtr,x)
res = transform!(mtr,x)

Implements: fit!, transform!

source

TSML.TSMLTypes.fit! — Function

fit!(csvwtr::CSVDateValWriter,x::T=[],y::Vector=[]) where {T<:Union{DataFrame,Vector,Matrix}}

Makes sure filename and dateformat are not empty strings.

source

TSML.TSMLTypes.fit! — Function

fit!(dtr::Dateifier,xx::T,y::Vector=[]) where {T<:Union{Matrix,Vector,DataFrame}}

Computes range of dates to be used during transform.

source

TSML.TSMLTypes.fit! — Function

fit!(dvzr::DateValizer,xx::T,y::Vector=[]) where {T<:DataFrame}

Validates input and computes global medians grouped by time period.

source

TSML.TSMLTypes.fit! — Function

fit!(csvrdr::CSVDateValReader,x::T=[],y::Vector=[]) where {T<:Union{DataFrame,Vector,Matrix}}

Makes sure filename and dateformat are not empty strings.

source

TSML.TSMLTypes.fit! — Function

fit!(dnnr::DateValNNer,xx::T,y::Vector=[]) where {T<:DataFrame}

Validates and checks arguments for errors.

source

TSML.TSMLTypes.fit! — Function

fit!(dvmr::DateValgator,xx::T,y::Vector=[]) where {T<:Union{Matrix,DataFrame}}

Checks and validates arguments.

source

TSML.TSMLTypes.fit! — Function

fit!(mtr::Matrifier,xx::T,y::Vector=Vector()) where {T<:Union{Matrix,Vector,DataFrame}}

Checks and validates that the inputs have the correct structure.

source

TSML.TSMLTypes.fit! — Function

fit!(dnnr::DateValMultiNNer,xx::T,y::Vector=[]) where {T<:DataFrame}

Validates and checks arguments for errors.

source

TSML.TSMLTypes.fit! — Function

fit!(bzcsvrdr::BzCSVDateValReader,x::T=[],y::Vector=[]) where {T<:Union{DataFrame,Vector,Matrix}}

Makes sure filename and dateformat are not empty strings.

source

TSML.TSMLTypes.fit! — Function

fit!(dnnr::DateValLinearImputer,xx::T,y::Vector=[]) where {T<:DataFrame}

Validates and checks arguments for errors.

source

TSML.TSMLTypes.transform! — Function

transform!(csvrdr::CSVDateValReader,x::T=[]) where {T<:Union{DataFrame,Vector,Matrix}}

Uses CSV package to read the csv file and converts it to dataframe.

source

TSML.TSMLTypes.transform! — Function

transform!(bzcsvrdr::BzCSVDateValReader,x::T=[]) where {T<:Union{DataFrame,Vector,Matrix}}

Uses CodecBzip2 package to read the csv file and converts it to dataframe.

source

TSML.TSMLTypes.transform! — Method

transform!(csvwtr::CSVDateValWriter,x::T) where {T<:Union{DataFrame,Vector,Matrix}}

Uses CSV package to write the dataframe into a csv file.

source

TSML.TSMLTypes.transform! — Method

transform!(dnnr::DateValLinearImputer,xx::T) where {T<:DataFrame}

Replaces missings by linear interpolation.
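
Linear interpolation over a run of missings can be sketched on a plain vector. This is an illustration, not TSML's implementation: the helper name linear_fill is hypothetical, positions are treated as equally spaced, and leading or trailing missings are left untouched.

```julia
# Hypothetical helper (not part of TSML): fill interior runs of `missing` by
# linear interpolation between the nearest known neighbors on each side.
function linear_fill(v::AbstractVector{<:Union{Missing,Real}})
    out = Vector{Union{Missing,Float64}}(v)
    known = findall(!ismissing, out)
    for (a, b) in zip(known[1:end-1], known[2:end])
        for i in (a+1):(b-1)
            t = (i - a) / (b - a)       # fractional position between a and b
            out[i] = out[a] + t * (out[b] - out[a])
        end
    end
    return out
end

filled = linear_fill([1.0, missing, missing, 4.0, missing])
```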

source

TSML.TSMLTypes.transform! — Method

transform!(dnnr::DateValMultiNNer,xx::T) where {T<:DataFrame}

Replaces missings by nearest neighbor or linear interpolation by looping over the dataset for each column until all missing values are gone.

source

TSML.TSMLTypes.transform! — Method

transform!(dnnr::DateValNNer,xx::T) where {T<:DataFrame}

Replaces missings by nearest neighbor looping over the dataset until all missing values are gone.
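
The repeated-pass idea can be sketched on a plain vector. This is an illustration under stated assumptions, not TSML's implementation: the helper name nn_fill is hypothetical, the neighborhood is symmetric, and the median is used as the aggregator.

```julia
using Statistics

# Hypothetical helper (not part of TSML): replace each missing value with the
# median of the non-missing values within `nnsize` positions on either side,
# repeating passes until no missing values remain.
function nn_fill(v::AbstractVector{<:Union{Missing,Real}}; nnsize::Int=1)
    nnsize >= 1 || throw(ArgumentError("nnsize must be at least 1"))
    out = Vector{Union{Missing,Float64}}(v)
    all(ismissing, out) && return out   # nothing to borrow from
    while any(ismissing, out)
        prev = copy(out)                # read from the previous pass only
        for i in findall(ismissing, prev)
            lo, hi = max(1, i - nnsize), min(length(prev), i + nnsize)
            neighbors = collect(skipmissing(prev[lo:hi]))
            isempty(neighbors) || (out[i] = median(neighbors))
        end
    end
    return out
end

filled = nn_fill([1.0, missing, missing, missing, 5.0]; nnsize=1)
```

The first pass fills only the missings adjacent to known values; the middle value is filled on the second pass from its newly filled neighbors.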

source

TSML.TSMLTypes.transform! — Method

transform!(dvmr::DateValgator,xx::T) where {T<:DataFrame}

Aggregates values grouped by date-time period using an aggregate function such as mean, median, maximum, or minimum. Default is median.
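
Grouping by date interval and aggregating can be sketched with the Dates standard library. This is an illustration, not TSML's implementation: the helper name aggregate_hourly is hypothetical, and assigning each timestamp to its hour with floor is an assumption about how groups are formed.

```julia
using Dates, Statistics

# Hypothetical helper (not part of TSML): group values by the hour their
# timestamp falls in and aggregate each group, skipping missings.
function aggregate_hourly(dates::AbstractVector{DateTime}, vals::AbstractVector; agg=median)
    groups = Dict{DateTime,Vector{Float64}}()
    for (d, v) in zip(dates, vals)
        bucket = get!(groups, floor(d, Dates.Hour(1)), Float64[])
        ismissing(v) || push!(bucket, v)
    end
    hours = sort(collect(keys(groups)))
    # An hour with only missings aggregates to missing.
    return [(h, isempty(groups[h]) ? missing : agg(groups[h])) for h in hours]
end

dates = collect(DateTime(2014,1,1,0,0):Dates.Minute(15):DateTime(2014,1,1,1,45))
vals = [1.0, 2.0, missing, 3.0, 10.0, 20.0, 30.0, 40.0]
hourly = aggregate_hourly(dates, vals; agg=mean)
```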

source

TSML.TSMLTypes.transform! — Method

transform!(dvzr::DateValizer,xx::T) where {T<:DataFrame}

Replaces missing with the corresponding global medians with respect to time period.

source

TSML.TSMLTypes.transform! — Method

transform!(dtr::Dateifier,xx::T) where {T<:Union{Matrix,Vector,DataFrame}}

Transforms to day of the month, day of the week, etc.
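
The kind of date features mentioned here can be extracted with the Dates standard library. The helper name date_features and the particular set of fields are assumptions for illustration; the source does not list the exact features Dateifier produces.

```julia
using Dates

# Hypothetical helper (not part of TSML): derive calendar features from a
# timestamp. dayofweek returns 1 for Monday through 7 for Sunday.
date_features(d::DateTime) = (year=year(d), month=month(d), day=day(d),
                              hour=hour(d), dow=dayofweek(d))

f = date_features(DateTime(2017, 1, 1, 13))
```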

source

TSML.TSMLTypes.transform! — Method

transform!(mtr::Matrifier,xx::T) where {T<:Union{Matrix,Vector,DataFrame}}

Applies the parameters of sliding windows to create the corresponding matrix
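
A sliding-window matrix can be sketched on a plain vector. This is an illustration, not TSML's implementation: the helper name sliding_windows is hypothetical, winsize stands in for :size, stride is taken to be the step between window starts, and the row layout (inputs followed by the target) is an assumption. The series must be long enough to hold at least one window.

```julia
# Hypothetical helper (not part of TSML): each row holds `winsize` consecutive
# inputs followed by the value `ahead` steps after the window; window starts
# advance by `stride`.
function sliding_windows(vals::AbstractVector{<:Real}; ahead::Int=1, winsize::Int=7, stride::Int=1)
    starts = 1:stride:(length(vals) - winsize - ahead + 1)
    rows = [vcat(vals[s:s+winsize-1], vals[s+winsize-1+ahead]) for s in starts]
    return permutedims(reduce(hcat, rows))
end

M = sliding_windows(1:10; ahead=2, winsize=3, stride=2)
```

The last column of M would serve as the training target and the remaining columns as the features.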

source

TSML.TSClassifiers.TSClassifier — Type

TSClassifier(
   Dict(
      # training directory
      :trdirectory => "",
      :tstdirectory => "",
      :modeldirectory => "",
      :feature_range => 7:20,
      :juliarfmodelname => "juliarfmodel.serialized",
      # Output to train against
      # (:class).
      :output => :class,
      # Options specific to this implementation.
      :impl_args => Dict(
         # Merge leaves having >= purity_threshold combined purity.
         :purity_threshold => 1.0,
         # Maximum depth of the decision tree (default: no maximum).
         :max_depth => -1,
         # Minimum number of samples each leaf needs to have.
         :min_samples_leaf => 1,
         # Minimum number of samples needed for a split.
         :min_samples_split => 2,
         # Minimum purity needed for a split.
         :min_purity_increase => 0.0
      )
   )
)

Classifies time series by type. Given a set of time series of known types, it computes the statistical features of each series and uses them as inputs to a random forest classifier whose output is the time-series type, then trains and tests the model. Alternatively, the statistical features can be used for clustering, followed by a check of cluster quality. If accuracy is poor, add more statistical features and repeat the same training and testing process. Each time-series file is assumed to be named after its type, which serves as the target output; for example, temperature time series are named temperature?.csv, where ? is an integer. The classifier loops over each file in a directory, computes its statistics, records them in a dictionary or dataframe, and then trains or tests. RandomForest is the default classifier.

source

TSML.TSMLTypes.fit! — Function

fit!(tsc::TSClassifier, features::T=[], labels::Vector=[]) where {T<:Union{Vector,Matrix,DataFrame}}

Get the stats of each file, collect as dataframe, and train.

source

TSML.TSMLTypes.transform! — Function

transform!(tsc::TSClassifier, features::T=[]) where {T<:Union{Vector,Matrix,DataFrame}}

Apply the learned parameters to the new data.

source

TSML.TSMLTypes.fit! — Method

fit!(tr::Transformer, instances::T, labels::Vector) where {T<:Union{Vector,Matrix,DataFrame}}

Generic fit! function to be redefined using multiple dispatch in different subtypes of Transformer.

source

TSML.TSMLTypes.transform! — Method

transform!(tr::Transformer, instances::T) where {T<:Union{Vector,Matrix,DataFrame}}

Generic transform! function to be redefined using multiple dispatch in different subtypes of Transformer.

source