Utils
cleanup_sina_record_from_kosh_sync(record)
Kosh adds data to the 'user_defined' section of records to keep track of syncing. This function removes those attributes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
record | sina.model.Record | The Sina record to clean up | required |
Returns:
Type | Description |
---|---|
dict | JSON-loaded representation of the record |
Source code in kosh/utils.py
compute_fast_sha(uri, n_samples=10)
Computes a fast, 'almost' unique identifier for a given uri. Assumes the uri is a path to a file; otherwise, simply returns the md5 hexdigest of the uri string.
If the uri is a valid path, the 'fast' sha is computed by hashing:
* the file size
* the first 2kb of the file
* the last 2kb of the file
* 2kb samples read at n_samples evenly spaced locations in the file
Warning: if the size is unchanged and the data changes anywhere outside those samples, the sha will be identical.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
uri | str | URI to compute fast_sha on | required |
n_samples | | Number of samples to extract from uri (in addition to the beginning and end of the file) | 10 |
Returns:
Type | Description |
---|---|
str | hexdigested sha |
Source code in kosh/utils.py
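The sampling scheme described for compute_fast_sha can be illustrated with a minimal sketch. This is not the code from kosh/utils.py: the function name `fast_sha_sketch`, the use of SHA-256, the 2048-byte chunk size, and the exact sample offsets are all assumptions made for illustration.

```python
import hashlib
import os


def fast_sha_sketch(uri, n_samples=10, chunk=2048):
    """Illustrative fingerprint built from the file size and a few samples."""
    if not os.path.isfile(uri):
        # Not a file: fall back to the md5 hexdigest of the uri string
        return hashlib.md5(uri.encode("utf-8")).hexdigest()

    size = os.path.getsize(uri)
    digest = hashlib.sha256()  # assumption: the source does not name the algorithm
    digest.update(str(size).encode("utf-8"))

    with open(uri, "rb") as f:
        # First chunk of the file
        digest.update(f.read(chunk))
        # n_samples chunks read at evenly spaced offsets (assumed spacing)
        step = size // (n_samples + 1)
        for i in range(1, n_samples + 1):
            f.seek(i * step)
            digest.update(f.read(chunk))
        # Last chunk of the file
        f.seek(max(size - chunk, 0))
        digest.update(f.read(chunk))
    return digest.hexdigest()
```

With this sketch, editing a byte that falls between two samples without changing the file size leaves the digest unchanged, which is the limitation the warning refers to.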
compute_long_sha(uri, buff_size=65536)
Computes the sha for a given uri.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
uri | str | URI to compute the sha on | required |
buff_size | int | How much data to read at once | 65536 |
Returns:
Type | Description |
---|---|
str | hexdigested sha |
Source code in kosh/utils.py
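The role of buff_size can be illustrated with a short sketch of buffered hashing. It is not the code from kosh/utils.py: the name `long_sha_sketch` and the choice of SHA-256 are assumptions, and the uri is assumed to be a local file path.

```python
import hashlib


def long_sha_sketch(path, buff_size=65536):
    """Hash an entire file, reading at most buff_size bytes at a time."""
    digest = hashlib.sha256()  # assumption: the source does not name the algorithm
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns b"" at end of file
        for block in iter(lambda: f.read(buff_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Reading in blocks keeps memory use bounded by buff_size regardless of file size, and the resulting digest does not depend on the block size chosen.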
create_kosh_users(record_handler, users=[os.environ.get('USER', 'default'), 'anonymous'])
Adds Kosh users to the Kosh store.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
record_handler | sina.records | The sina records object | required |
users | list | List of usernames to add | [os.environ.get('USER', 'default'), 'anonymous'] |
Source code in kosh/utils.py
create_new_db(name, db='sql', keyspace=None, **kargs)
Creates a new Kosh database and adds a single user.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the database | required |
db | str, optional | Type of database, defaults to 'sql', can be 'cass' | 'sql' |
keyspace | str, optional | For Cassandra, the keyspace to use; defaults to None, which means [user]_k | None |
kargs | dict | Any additional key/value pairs you need to pass to store creation | {} |
Returns:
Type | Description |
---|---|
KoshStoreClass | A handle to the Kosh store created |
Source code in kosh/utils.py
datasets_in_place_of_records(func)
This decorator converts all Record inputs and outputs to KoshDataset. This allows a user to call sina functions that expect a Record with Kosh datasets instead.
Source code in kosh/utils.py
find_curveset_and_curve_name(name, rec)
Given a curveset or curveset+curve name, returns all matching curve_sets and curves combinations. Assumes the curveset and curve are separated by a /. curve_sets that exactly match the name are returned as (name, None).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name to parse | required |
rec | sina record | Sina record in which to look for curves | required |
Returns:
Type | Description |
---|---|
tuple of tuples | All possible combinations of (curveset, curve) that match name |
Source code in kosh/utils.py
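Because curve set and curve names may themselves contain a /, one name can map to several (curveset, curve) pairs. The sketch below illustrates that ambiguity on a plain dictionary. It is not the code from kosh/utils.py: the helper name `match_curveset_and_curve` is hypothetical, and it assumes the curve sets are given as a dictionary with 'independent' and 'dependent' sub-dictionaries rather than as a Sina record.

```python
def match_curveset_and_curve(name, curve_sets):
    """Return every (curve_set, curve) pair that `name` could refer to."""
    matches = []
    for cs_name, cs in curve_sets.items():
        if name == cs_name:
            # The name is exactly a curve set: no curve part
            matches.append((cs_name, None))
        elif name.startswith(cs_name + "/"):
            # Everything after "<curve set>/" is a candidate curve name
            curve = name[len(cs_name) + 1:]
            curves = set(cs.get("independent", {})) | set(cs.get("dependent", {}))
            if curve in curves:
                matches.append((cs_name, curve))
    return tuple(matches)


# Hypothetical curve sets where "a/b/t" is ambiguous
curve_sets = {
    "a": {"independent": {"b/t": {"value": [0, 1]}}, "dependent": {}},
    "a/b": {"independent": {"t": {"value": [0, 1]}}, "dependent": {"p": {"value": [2, 3]}}},
}
print(match_curveset_and_curve("a/b/t", curve_sets))  # (('a', 'b/t'), ('a/b', 't'))
print(match_curveset_and_curve("a/b", curve_sets))    # (('a/b', None),)
```

Returning every candidate pair leaves the choice between them to the caller.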
gen_labels(G)
Generates labels to draw on networkx plots of a graph
Parameters:
Name | Type | Description | Default |
---|---|---|---|
G | networkx.DiGraph (OrderedDiGraph on older version of networkx) | Network to generate labels from | required |
Returns:
Type | Description |
---|---|
dict | labels for this graph |
Source code in kosh/utils.py
get_graph(input_type, loader, transformers)
Given a loader and its transformers, returns the path to the desired format, i.e. which output format each transformer should pick so that it can be chained to the following one and produce the desired final format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_type | str | Input type of the first node | required |
loader | KoshLoader | Original loader | required |
transformers | list of KoshTransformer | Transformers to be applied after the loader exits | required |
Returns:
Type | Description |
---|---|
networkx.OrderedDiGraph | execution graph |
Source code in kosh/utils.py
merge_datasets_handler(target_dataset, imported_dataset, section='data', **kargs)
When importing a dataset, checks whether the imported dataset has attributes that match those of the dataset already in this store. If attribute values conflict, 'handling_method' is used to resolve the conflict.
The dataset in the store (target_dataset) is not updated here; the attribute/value pairs resolving the conflicts are returned instead.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target_dataset | kosh.KoshDataset | The dataset that will receive the merge | required |
imported_dataset | kosh.KoshDataset \| dict | The dataset we are trying to merge into target_dataset, or its attributes/values dictionary | required |
section | str | The section being updated (data, user_defined, curves, etc.) | 'data' |
handling_method | | How to handle conflicts. None or "conservative": exit with an error; "preserve": keep the value from target_dataset; "overwrite": use the value from the imported dataset | required |
Returns:
Type | Description |
---|---|
dict | Dictionary of attribute/value pairs that the target_dataset should have |
Source code in kosh/utils.py
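The three conflict-handling modes listed for merge_datasets_handler can be illustrated on plain dictionaries. This sketch is not the code from kosh/utils.py: the helper name `resolve_attribute_conflicts` is hypothetical, it works on attribute dictionaries instead of KoshDataset objects, and raising ValueError for the "conservative" case is an assumption about what "error exit" means.

```python
def resolve_attribute_conflicts(target_attrs, imported_attrs, handling_method=None):
    """Return the attributes the target should have after the merge."""
    merged = dict(target_attrs)  # the target itself is left untouched
    for key, value in imported_attrs.items():
        if key not in target_attrs or target_attrs[key] == value:
            # New attribute or identical value: nothing to resolve
            merged[key] = value
        elif handling_method in (None, "conservative"):
            raise ValueError(f"conflicting values for attribute {key!r}")
        elif handling_method == "preserve":
            continue  # keep the value from the target
        elif handling_method == "overwrite":
            merged[key] = value  # use the value from the imported dataset
        else:
            raise ValueError(f"unknown handling_method {handling_method!r}")
    return merged


target = {"project": "alpha", "runs": 3}
imported = {"runs": 5, "owner": "someone"}
print(resolve_attribute_conflicts(target, imported, "preserve"))
# {'project': 'alpha', 'runs': 3, 'owner': 'someone'}
print(resolve_attribute_conflicts(target, imported, "overwrite"))
# {'project': 'alpha', 'runs': 5, 'owner': 'someone'}
```

Returning a new dictionary mirrors the note that the dataset in the store is not modified by the handler; applying the result is left to the caller.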
record_to_dataset(record)
Converts a Sina record to a KoshDataset
Parameters:
Name | Type | Description | Default |
---|---|---|---|
record | sina.model.Record | The Sina Record to convert | required |
Returns:
Type | Description |
---|---|
kosh.KoshDataset | Kosh version of the record |
Source code in kosh/utils.py
update_store_and_get_info_record(records, ensemble_predicate=None)
Obtains the sina record containing store info. If necessary, updates the store to the latest standards.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
records | sina.datastore.DataStore.RecordOperations | The sina store "records" object | required |
ensemble_predicate | str | The predicate for the relationship to an ensemble | None |
Returns:
Type | Description |
---|---|
Record | sina record for store info |
Source code in kosh/utils.py
version(comparable=False)
Returns version string
Parameters:
Name | Type | Description | Default |
---|---|---|---|
comparable | bool | If True, returns the version as a tuple of ints so it can be compared | False |
Returns:
Type | Description |
---|---|
str \| tuple | version string or tuple |
Source code in kosh/utils.py
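The reason for offering a comparable form is that version strings do not order correctly when compared as text. The sketch below shows the difference. It is not the code from kosh/utils.py: the helper `to_comparable` is hypothetical and simply keeps the leading numeric components of a dotted version string.

```python
def to_comparable(version_string):
    """Turn a dotted version string into a tuple of ints."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component (e.g. a dev tag)
        parts.append(int(piece))
    return tuple(parts)


# As strings, "3.10.0" sorts before "3.9.0" because "1" < "9"
print("3.10.0" < "3.9.0")                                # True
# As tuples of ints, the ordering is the expected one
print(to_comparable("3.10.0") > to_comparable("3.9.0"))  # True
```

A tuple of ints can be compared directly against a minimum required version, for example `to_comparable(v) >= (3, 0)`.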
walk_dictionary_keys(dictionary, separator='/')
Walks through a dictionary and returns all levels of keys. Sub-dictionary keys are appended to their parent key with the 'separator'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dictionary | dict | The dictionary to walk | required |
separator | str | The string to use between a parent key and its children | '/' |
Returns:
Type | Description |
---|---|
generator | generator of keys and possibly their sub keys |
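The behavior described for walk_dictionary_keys can be illustrated with a small recursive generator. This is a sketch written from the description above, not the code from kosh/utils.py; the name `walk_keys` is hypothetical, and the order in which keys are yielded is an assumption.

```python
def walk_keys(dictionary, separator="/"):
    """Yield every key, with nested keys prefixed by their parent key."""
    for key, value in dictionary.items():
        yield key
        if isinstance(value, dict):
            # Recurse into the sub-dictionary and prefix its keys
            for sub_key in walk_keys(value, separator):
                yield f"{key}{separator}{sub_key}"


nested = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
print(list(walk_keys(nested)))  # ['a', 'a/b', 'a/c', 'a/c/d', 'e']
```

Because the result is a generator, keys are produced lazily; wrap the call in `list()` when the full set of keys is needed at once.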