Skl
DBSCAN
Bases: SKL
SKL-based DBSCAN Kosh classifier.
This transformer returns either:
* an estimator, or
* a set of labels/numpy arrays for each class found by the estimator.
If you choose to return the arrays (format=numpy), you can control how much data is returned for each class/label. When initializing the transformer, you can pass any argument needed to initialize the SKL classifier.
Source code in kosh/transformers/skl.py
KMeans
Bases: SKL
SKL-based KMeans Kosh classifier.
This transformer returns either:
* an estimator, or
* a set of labels/numpy arrays for each class found by the estimator.
If you choose to return the arrays (format=numpy), you can control how much data is returned for each class/label. When initializing the transformer, you can pass any argument needed to initialize the SKL classifier.
SKL
Bases: KoshTransformer
Base class for SKL-based Kosh classifiers.
This transformer returns either:
* an estimator, or
* a set of labels/numpy arrays for each class found by the estimator.
If you choose to return the arrays (format=numpy), you can control how much data is returned for each class/label. When initializing the transformer, you can pass any argument needed to initialize the SKL classifier.
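To make the "labels/numpy arrays for each class" output more concrete, here is a minimal sketch of grouping samples by the label an estimator assigned to them. It uses only the standard library and plain lists rather than ndarrays; `group_by_label` and the sample data are hypothetical and are not part of Kosh.

```python
from collections import defaultdict


def group_by_label(samples, labels):
    """Return (sorted_labels, groups); groups[i] holds the samples of sorted_labels[i].

    Illustrative helper only, not Kosh's implementation.
    """
    buckets = defaultdict(list)
    for sample, label in zip(samples, labels):
        buckets[label].append(sample)
    ordered = sorted(buckets)
    return ordered, [buckets[label] for label in ordered]


# Hypothetical data: labels as a clustering estimator might assign them.
samples = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (9.0, 9.0)]
labels = [0, 1, 0, 1, -1]  # scikit-learn's DBSCAN marks noise points with -1

found, groups = group_by_label(samples, labels)
```

Each entry of `groups` is the per-class collection that could then be sub-sampled according to `n_samples` and `sampling_method`.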
__init__(*args, **kargs)
initialize Kosh classifier
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_samples | float | number or percentage of samples to send back for each class | required |
sampling_method | float | how n_samples is interpreted. unit: return the first n_samples samples of the class; percent: return the first n_samples % of the class; random_unit: return n_samples random points from each class; random_percent: return n_samples % of the class, chosen randomly | required |
random_state | int | random state for reproducibility | required |
skl_class | sklearn classifier | SKL classifier | required |
Returns:
Type | Description |
---|---|
estimator from the classifier.fit(input) function, or labels and a list of ndarrays with the samples in each class, possibly sub-sampled via n_samples/sampling_method |
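The four `sampling_method` values can be sketched as follows. This is an illustration of the semantics described in the parameter table, not Kosh's code: `subsample` is a hypothetical helper, and it assumes that percentages are given on a 0-100 scale, which the documentation does not state.

```python
import random


def subsample(points, n_samples, sampling_method, random_state=None):
    """Pick a subset of one class's points (illustrative, not Kosh's code)."""
    if sampling_method in ("percent", "random_percent"):
        # Assumption: n_samples is a percentage (0-100) of the class size.
        count = int(len(points) * n_samples / 100.0)
    elif sampling_method in ("unit", "random_unit"):
        # n_samples is an absolute number of points.
        count = int(n_samples)
    else:
        raise ValueError("unknown sampling_method: %r" % (sampling_method,))
    count = min(count, len(points))
    if sampling_method.startswith("random"):
        # A seeded generator makes the random draw reproducible.
        return random.Random(random_state).sample(points, count)
    # Non-random methods keep the first `count` points.
    return points[:count]


points = list(range(20))
first_five = subsample(points, 5, "unit")
first_quarter = subsample(points, 25, "percent")
random_five = subsample(points, 5, "random_unit", random_state=42)
random_half = subsample(points, 50, "random_percent", random_state=0)
```

Passing the same `random_state` twice yields the same random subset, which is the purpose of that parameter.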
transform(input, format)
If format is numpy, scales the input data.
If format is estimator, returns the estimator.
Possibly pads the ends with a value.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | ndarray | array from previous loader or transformer | required |
format | str | output format | required |
Returns:
Type | Description |
---|---|
input taken over transformer's axis and indices |
Splitter
Bases: KoshTransformer
SKL-based class to split a dataset into train, test and validation sets
__init__(train_size=None, test_size=None, validation_size=None, splitter=None, random_state=None, n_splits=1, *args, **kargs)
Initialize the Splitter transformer.
At least two of the train/test/validation sizes are required, and the sizes must add up to 100%.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
train_size | float | size of the dataset to reserve for training (.9 = 90% default) | None |
test_size | float | size of the dataset to reserve for testing (.1 = 10% default) | None |
validation_size | float | size of the dataset to reserve for validation (.0 = 0% default) | None |
splitter | sklearn.model_selection Splitter class | SKL splitter used to split the data. The same splitter is used first to select the training set and then to split the rest between test and validation. Default: sklearn.model_selection.ShuffleSplit | None |
random_state | int | random state for reproducibility. Controls the randomness of the training and testing indices produced. Pass an int for reproducible output across multiple function calls. | None |
n_splits | int | total number of split iterations to generate (default 1) | 1 |
groups | list or None | split data according to a third-party provided group. This group information can be used to encode arbitrary domain-specific stratifications of the samples as integers. For instance, the groups could be the year of collection of the samples and thus allow for cross-validation against time-based splits. | required |
Returns:
Type | Description |
---|---|
Splitter | initialized Splitter transformer |
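The rule that at least two sizes must be given and that all three must add up to 100% can be sketched as below. `resolve_sizes` is a hypothetical helper written for illustration; how Kosh actually validates or fills in a missing size is an assumption here.

```python
def resolve_sizes(train_size=None, test_size=None, validation_size=None):
    """Fill in at most one missing fraction so the three sum to 1.0 (illustrative)."""
    sizes = {"train": train_size, "test": test_size, "validation": validation_size}
    missing = [name for name, value in sizes.items() if value is None]
    if len(missing) > 1:
        raise ValueError("at least two of train/test/validation sizes are required")
    if missing:
        # Assumption: the missing size is whatever remains of the dataset.
        known = sum(value for value in sizes.values() if value is not None)
        sizes[missing[0]] = round(1.0 - known, 10)
    if min(sizes.values()) < 0 or abs(sum(sizes.values()) - 1.0) > 1e-9:
        raise ValueError("train/test/validation sizes must add up to 100%")
    return sizes["train"], sizes["test"], sizes["validation"]


train, test, validation = resolve_sizes(train_size=0.8, test_size=0.1)
```

With 80% for training and 10% for testing, the remaining 10% is left for validation.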
transform(input, format)
Split input data between n_splits sets of training/test/validation
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | ndarray | array from previous loader or transformer | required |
format | str | output format | required |
Returns:
Type | Description |
---|---|
n_splits list of train, test, validation ndarrays | n_splits sets of train/test/validation |
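The two-stage procedure described for the `splitter` parameter (select the training set first, then divide the remainder between test and validation) can be sketched with the standard library alone. This is a simplified stand-in that shuffles plain lists; it does not use scikit-learn's splitter classes, and `split_once` and `split` are hypothetical names.

```python
import random


def split_once(items, train_size, test_size, validation_size, rng):
    """One train/test/validation split done in two stages (illustrative)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    # Stage 1: carve out the training set.
    n_train = int(round(train_size * len(shuffled)))
    train, rest = shuffled[:n_train], shuffled[n_train:]
    # Stage 2: share what is left between test and validation.
    remainder = test_size + validation_size
    n_test = int(round(len(rest) * test_size / remainder)) if remainder else 0
    return train, rest[:n_test], rest[n_test:]


def split(items, train_size, test_size, validation_size,
          n_splits=1, random_state=None):
    """Return n_splits (train, test, validation) tuples."""
    rng = random.Random(random_state)  # seeded for reproducibility
    return [split_once(items, train_size, test_size, validation_size, rng)
            for _ in range(n_splits)]


data = list(range(100))
splits = split(data, 0.8, 0.1, 0.1, n_splits=3, random_state=7)
```

Each element of `splits` is one (train, test, validation) set, so `n_splits=3` produces three independent shufflings of the same data.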
StandardScaler
Bases: KoshTransformer
__init__(*args, **kargs)
SKL-based scaler transformer
transform(input, format)
Calls the fit_transform function of the scaler on the input data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input | ndarray | array from previous loader or transformer | required |
format | str | output format | required |
Returns:
Type | Description |
---|---|
ndarray | scaled input data |
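As a rough picture of what standard scaling does to the data, the sketch below standardizes each column to zero mean and unit variance using only the standard library. It is an illustration, not the transformer's code: `standardize` is a hypothetical helper, it works on lists of columns rather than ndarrays, and treating a constant column as having scale 1 is an assumption made to avoid dividing by zero.

```python
import statistics


def standardize(columns):
    """Scale each column to zero mean and unit (population) variance."""
    scaled = []
    for column in columns:
        mean = statistics.mean(column)
        std = statistics.pstdev(column)  # population standard deviation
        scale = std if std > 0 else 1.0  # assumption: leave constant columns centered only
        scaled.append([(value - mean) / scale for value in column])
    return scaled


columns = [[1.0, 2.0, 3.0], [5.0, 5.0, 5.0]]
scaled = standardize(columns)
```

Because the transformer calls fit_transform, the mean and scale are computed from the very data being transformed, so each call is fitted to its own input.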