Running Kosh from the Command Line¶
Sometimes you do not want to write a script just to run simple queries or operations.
Starting with Kosh 0.8, the kosh command line utility helps you perform simple tasks from the prompt.
Table of Contents
- Basics
- Finding Datasets
- Printing Datasets
- Adding Datasets
- Removing Datasets
- (Dis)Associating Files with Datasets
- Features
Basics¶
To list the available Kosh commands, type the following at the command line:
kosh --help
This will print something like this:
usage: kosh <command> [<args>]
Available commands are:
add
associate
cleanup_files
cp
create
create_ensemble
create_new_db
dissociate
extract
fast_sha
features
find
htar
mv
print
reassociate
remove
rm
tar
Execute kosh operations
positional arguments:
command Subcommand to run
optional arguments:
-h, --help show this help message and exit
--store STORE, -s STORE
Kosh store to use (default: None)
--dataset_record_type DATASET_RECORD_TYPE, -d DATASET_RECORD_TYPE
type used by sina db that Kosh will recognize as
dataset (default: dataset)
--version, -v print version and exit (default: False)
Kosh version 1.2
Preparing the notebook¶
Let's copy a sample Kosh store locally:
import shutil
shutil.copy("../tests/baselines/sina/data.sqlite", "cmd_line.sql")
Help messages¶
In addition to the top-level help message, each command has its own help:
kosh find --help
Returns:
usage: kosh [-h] --store STORE [--dataset_record_type DATASET_RECORD_TYPE]
[--print]
Find Kosh store for datasets matching metadata in form key=value
optional arguments:
-h, --help show this help message and exit
--store STORE, -s STORE
Kosh store to use (default: None)
--dataset_record_type DATASET_RECORD_TYPE, -d DATASET_RECORD_TYPE
type used by sina db that Kosh will recognize as
dataset (default: dataset)
--print, -p print each dataset info (default: False)
Create a store¶
You can create a new empty Kosh store by issuing:
kosh create_new_db --uri my_store.sql
Populate the store¶
You can populate a store with a new dataset.
Metadata (attribute) values on the dataset will be evaluated, so make sure to use proper quoting to ensure the correct type in the store.
The following line creates a dataset with 3 attributes: paramint, paramfloat and paramstr. Pay attention to the escaping needed for the string attribute.
kosh create --store=my_store.sql paramint=2 paramfloat 2.4 paramstr "'45'"
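To see why the quoting matters, here is a minimal Python sketch of how a command-line string could be turned into a typed value. It uses `ast.literal_eval` as a stand-in; this tutorial does not show how Kosh evaluates values internally, and `parse_value` is a hypothetical helper, not part of Kosh. Note that the shell removes the outer double quotes, so `"'45'"` reaches the program as `'45'`.

```python
import ast


def parse_value(text):
    """Hypothetical helper (not Kosh code): give a command-line string a type."""
    try:
        # Python literals such as 2, 2.4 or '45' become int, float or str
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Anything that is not a literal is kept as a plain string
        return text


# Strings as the program receives them, after the shell stripped outer quotes
for raw in ["2", "2.4", "'45'", "45"]:
    value = parse_value(raw)
    print(repr(raw), "->", repr(value), type(value).__name__)
```

Under this assumption, the inner quotes are what keep 45 a string: without them the value would be read as an integer.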
Finding a store¶
Assuming you ran the tutorials, you should have an example Kosh SQL store here. Let's query it.
This is a raw Sina database in which datasets are identified by the record type obs, so we use the -d option to let Kosh know about it:
kosh find --store cmd_line.sql -d obs
Returns 10 datasets
2019-05-06-08-51-31
2018-10-10-02-51-38
2020-07-05-04-21-21
2019-04-05-02-11-29
2018-04-06-05-31-13
2019-07-05-13-51-12
2014-04-05-06-21-43
2017-06-04-03-41-31
2018-04-05-15-31-33
2019-08-03-19-11-27
The --print option gives detailed output for each dataset:
kosh find --store cmd_line.sql -d obs --print
Returns 10 datasets
KOSH DATASET
id: 2017-06-04-03-41-31
name:???
creator: ???
--- Attributes ---
PARAM1: 241.289
PARAM2: 184.63
PARAM3: 16519.23
PARAM4: 997625.0
date: 6/4/2017
latitude: 57.756035
longitude: 40.961206
time: 3:41:31 AM
--- Associated Data (0)---
=======================================================================
KOSH DATASET
id: 2014-04-05-06-21-43
name:???
creator: ???
[snip]
Let's find a subset of datasets based on some key/value constraints:
kosh find --store cmd_line.sql -d obs 'PARAM2<=172'
Shows only 5 datasets:
2014-04-05-06-21-43
2019-04-05-02-11-29
2018-04-06-05-31-13
2018-04-05-15-31-33
2019-07-05-13-51-12
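The effect of such a constraint can be sketched in plain Python. The code below is an illustration only and does not call Kosh: `parse_constraint` and `matches` are hypothetical helpers, the operator table is the sketch's own choice rather than a list of what kosh find accepts, and only numeric values are handled. The two PARAM2 values are taken from the dataset printouts in this tutorial.

```python
import operator
import re

# Operator table for this sketch only (not a statement about Kosh's syntax)
OPERATORS = {
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
    "=": operator.eq,
}


def parse_constraint(expression):
    """Split an expression such as 'PARAM2<=172' into key, comparison, number."""
    # Two-character operators are listed first so '<=' is not read as '<'
    match = re.fullmatch(r"(\w+)\s*(<=|>=|<|>|=)\s*(.+)", expression)
    if match is None:
        raise ValueError("cannot parse constraint: " + expression)
    key, symbol, raw = match.groups()
    return key, OPERATORS[symbol], float(raw)


def matches(attributes, expression):
    """Return True if the attribute exists and satisfies the constraint."""
    key, compare, threshold = parse_constraint(expression)
    return key in attributes and compare(attributes[key], threshold)


# PARAM2 values copied from the two datasets printed in this tutorial
datasets = {
    "2017-06-04-03-41-31": {"PARAM2": 184.63},
    "2019-04-05-02-11-29": {"PARAM2": 163.28},
}
selected = [name for name, attrs in datasets.items() if matches(attrs, "PARAM2<=172")]
print(selected)
```

Only 2019-04-05-02-11-29 passes the filter here, which agrees with the list returned by kosh find.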
Printing a dataset info¶
One can print info about a single dataset:
kosh print -s cmd_line.sql -d obs -i 2019-04-05-02-11-29
Shows:
KOSH DATASET
id: 2019-04-05-02-11-29
name:???
creator: ???
--- Attributes ---
PARAM1: 143.557
PARAM2: 163.28
PARAM3: 20418.05
PARAM4: 997353.0
date: 4/5/2019
latitude: 37.762307
longitude: 140.959459
time: 2:11:29 AM
--- Associated Data (0)---
=======================================================================
Adding a dataset to the store¶
kosh add -s 'cmd_line.sql' -d obs -i '2020-03-11-13-45-23' PARAM1=156 PARAM2=.2 PARAM3="something"
Removing a dataset from the store¶
kosh remove -s 'cmd_line.sql' -d obs -i '2020-03-11-13-45-23'
This prints the dataset and asks for confirmation:
KOSH DATASET
id: 2020-03-11-13-45-23
name:Unnamed Dataset
creator: anonymous
--- Attributes ---
PARAM1: 156
PARAM2: 0.2
PARAM3: something
creator: anonymous
name: Unnamed Dataset
--- Associated Data (0)---
You are about the remove this dataset (2020-03-11-13-45-23). Do you want to continue? (y/N)y
(Dis)Associating a file with a dataset¶
We can associate a file, along with its mime_type, with a dataset:
kosh associate -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29' -u ../tests/baselines/node_extracts2/node_extracts2.hdf5 -m hdf5
The opposite is also possible:
kosh dissociate -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29' -u ../tests/baselines/node_extracts2/node_extracts2.hdf5
Features associated with a dataset¶
Listing features¶
Let's associate a file and list its features:
kosh associate -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29' -u ../tests/baselines/node_extracts2/node_extracts2.hdf5 -m hdf5
kosh features -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29'
Returns:
Dataset: 2019-04-05-02-11-29:
['cycles', 'direction', 'elements', 'node/metrics_0', 'node/metrics_1', 'node/metrics_10', 'node/metrics_11', 'node/metrics_12', 'node/metrics_2', 'node/metrics_3', 'node/metrics_4', 'node/metrics_5', 'node/metrics_6', 'node/metrics_7', 'node/metrics_8', 'node/metrics_9', 'zone/metrics_0', 'zone/metrics_1', 'zone/metrics_2', 'zone/metrics_3', 'zone/metrics_4']
Extracting features¶
Features can also be extracted:
kosh extract -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29' -f node/metrics_4 zone/metrics_4 --dump sample.npy
Cleaning up dead files¶
Although Kosh helps you manage your files (see this notebook for more details), from time to time files are moved outside of the Kosh environment.
If a file has been moved or renamed, you can try to reassociate it:
cp ../README.md my_file.md
kosh associate -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29' -u my_file.md -m md
kosh print -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29'
mv my_file.md my_new_name.md
kosh reassociate -s 'cmd_line.sql' -d obs -n my_new_name.md
kosh print -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29'
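One way a tool can recognise a renamed file is by comparing content digests. The sketch below illustrates that idea with Python's hashlib. It is an assumption made for illustration: this tutorial does not state how kosh reassociate matches files, and `content_digest` and `match_moved_file` are hypothetical helpers.

```python
import hashlib
import os
import tempfile


def content_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_moved_file(recorded_digest, candidates):
    """Return the first candidate whose content matches the digest, else None."""
    for path in candidates:
        if os.path.isfile(path) and content_digest(path) == recorded_digest:
            return path
    return None


with tempfile.TemporaryDirectory() as workdir:
    old_path = os.path.join(workdir, "my_file.md")
    new_path = os.path.join(workdir, "my_new_name.md")
    with open(old_path, "w") as handle:
        handle.write("# sample content\n")
    recorded = content_digest(old_path)  # digest noted while the file was at its old path
    os.rename(old_path, new_path)        # the file is renamed outside the store
    found = match_moved_file(recorded, [new_path])
    print(os.path.basename(found))
```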
Sometimes, though, the file was removed altogether. In this case you need to clean up the store:
rm my_new_name.md
# let's do a dry run first
kosh cleanup_files --dry-run -s 'cmd_line.sql' -d obs
# And interactive cleaning limiting to the md mime_type
kosh cleanup_files -s 'cmd_line.sql' -d obs -i mime_type=md
kosh print -s 'cmd_line.sql' -d obs -i '2019-04-05-02-11-29'
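Conceptually, a dry run of this kind boils down to checking which associated paths no longer exist on disk. The following is a minimal sketch of that check, independent of Kosh: `find_dead_files` is a hypothetical helper and the associations dictionary is a stand-in for what a store would record.

```python
import os
import tempfile


def find_dead_files(associations):
    """Return {dataset_id: [paths]} limited to paths missing from disk."""
    dead = {}
    for dataset_id, uris in associations.items():
        missing = [uri for uri in uris if not os.path.exists(uri)]
        if missing:
            dead[dataset_id] = missing
    return dead


with tempfile.TemporaryDirectory() as workdir:
    kept = os.path.join(workdir, "kept.md")
    removed = os.path.join(workdir, "removed.md")
    for path in (kept, removed):
        with open(path, "w") as handle:
            handle.write("sample\n")
    # Stand-in for the associations a store would hold for one dataset
    associations = {"2019-04-05-02-11-29": [kept, removed]}
    os.remove(removed)  # the file disappears without the store knowing
    report = find_dead_files(associations)
    print(report)
```

Reporting before deleting, as the dry run does, lets you confirm that a path is really gone and not just on an unmounted file system.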