colda.workflow package

Subpackages

Submodules

colda.workflow.abstract_workflow module

class colda.workflow.abstract_workflow.AbstractTestMainWorkflow

Bases: ABC

Abstract class for test main workflow

abstract find_test_assistor(**kwargs)

abstract classmethod get_instance()

abstract test_assistor_match_identifier(**kwargs)

abstract test_assistor_request(**kwargs)

abstract test_match_identifier(**kwargs)

abstract test_output(**kwargs)

abstract test_sponsor_match_identifier(**kwargs)

class colda.workflow.abstract_workflow.AbstractTrainMainWorkflow

Bases: ABC

Abstract class for train main workflow

abstract find_assistor(**kwargs)

abstract classmethod get_instance()

abstract stop_train(**kwargs)

abstract train_assistor_match_identifier(**kwargs)

abstract train_assistor_request(**kwargs)

abstract train_assistor_situation(**kwargs)

abstract train_match_identifier(**kwargs)

abstract train_output(**kwargs)

abstract train_situation(**kwargs)

abstract train_sponsor_match_identifier(**kwargs)

abstract train_sponsor_situation(**kwargs)
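The two abstract classes above only declare an interface. As a minimal sketch of how such an interface behaves in Python, the example below defines a small abstract base class with an abstract classmethod and one abstract method, plus a concrete subclass. The names `AbstractDemoWorkflow`, `DemoWorkflow`, and `can_instantiate`, and the caching inside `get_instance`, are hypothetical and are not part of colda.

```python
from abc import ABC, abstractmethod


class AbstractDemoWorkflow(ABC):
    """Hypothetical stand-in for an abstract workflow class."""

    @classmethod
    @abstractmethod
    def get_instance(cls) -> "AbstractDemoWorkflow":
        """Return a workflow object."""

    @abstractmethod
    def find_assistor(self, **kwargs) -> str:
        """Start a task and return its identifier."""


class DemoWorkflow(AbstractDemoWorkflow):
    """Concrete subclass that implements every abstract member."""

    _instance = None  # cached object (the caching strategy is an assumption)

    @classmethod
    def get_instance(cls) -> "DemoWorkflow":
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def find_assistor(self, **kwargs) -> str:
        # Placeholder body: a real workflow would contact the server here.
        return f"task-for-{kwargs.get('task_name', 'unnamed')}"


def can_instantiate(cls) -> bool:
    """Report whether cls has no abstract members left to implement."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Instantiating the abstract class raises `TypeError`, while the subclass can be created once every abstract member is implemented.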

colda.workflow.api module

class colda.workflow.api.TestMainWorkflow

Bases: AbstractTestMainWorkflow

Manage test workflow

Methods

get_instance find_test_assistor test_assistor_request test_match_identifier test_sponsor_match_identifier test_assistor_match_identifier test_output

classmethod find_test_assistor(train_id: str, test_file_path: str, test_id_column: str, test_data_column: str, test_target_column: str, test_name: str | None = None, test_description: str | None = None) → str: Start testing with all assistors of the previous train task

Parameters

train_id : str
test_file_path : str
test_id_column : str
test_data_column : str
test_target_column : str
test_name : str=None
test_description : str=None

Returns

str

classmethod get_class() → type[workflow.test_main_workflow.TestMainWorkflow]: Get current class.

Returns

type[TestMainWorkflow]

classmethod test_assistor_match_identifier(test_id: str, test_id_dict: dict[str, str]) → bool

Match_identifier is the second stage of testing for the assistor. In this stage, the assistor would:

Match the identifiers sent from the sponsor

Test the trained models produced at each round of the training stage using test data

Send the test outputs to the sponsor

Parameters

test_id : str
test_id_dict : dict[str, str]

Returns

None

classmethod test_assistor_request(test_id_dicts: dict[dict[str, str]]) → None

Request is the first stage of testing for the assistor. In this stage, the assistor would:

Encrypt the identifiers

Send the identifiers to the server

Parameters

test_id_dicts : dict[dict[str, str]]

Returns

None
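The documentation does not say how identifiers are encrypted before they are sent. As a rough illustration only, the sketch below substitutes a salted SHA-256 hash for the unspecified encryption step; the function name `hash_identifiers` and the `salt` parameter are hypothetical, and the real package may use a different scheme.

```python
import hashlib


def hash_identifiers(identifiers: list[str], salt: str = "") -> list[str]:
    """Return a SHA-256 hex digest for each identifier.

    Hashing stands in for the unspecified encryption step; `salt` is a
    hypothetical parameter that all participants would have to share.
    """
    return [
        hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()
        for identifier in identifiers
    ]
```

Because the digest is deterministic, two parties that hash the same identifier with the same salt obtain the same value, which is what makes later matching possible without exchanging raw identifiers.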

classmethod test_match_identifier(test_id_dicts: dict[dict[str, str]]) → None: Handle the unread_test_match_identifier. Two cases must be handled: sponsor and assistor

Parameters

test_id_dicts : dict[dict[str, str]]

Returns

None
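The sponsor/assistor split described for test_match_identifier could be sketched as a simple dispatcher. This is an illustration under assumptions: the `Role` class mirrors the two string constants documented on `CheckSponsor`, while `dispatch_match_identifier`, `role_of`, and the handler callbacks are hypothetical names, and the input is assumed to map task ids to identifier dicts.

```python
from typing import Callable, Final


class Role:
    """Local stand-in for the two role names (values taken from CheckSponsor)."""

    sponsor: Final[str] = "sponsor"
    assistor: Final[str] = "assistor"


Handler = Callable[[str, dict[str, str]], None]


def dispatch_match_identifier(
    id_dicts: dict[str, dict[str, str]],
    role_of: Callable[[str], str],
    on_sponsor: Handler,
    on_assistor: Handler,
) -> list[str]:
    """Route each task to the sponsor or assistor handler; return the roles seen."""
    roles = []
    for task_id, id_dict in id_dicts.items():
        role = role_of(task_id)  # hypothetical lookup of this user's role
        if role == Role.sponsor:
            on_sponsor(task_id, id_dict)
        elif role == Role.assistor:
            on_assistor(task_id, id_dict)
        else:
            raise ValueError(f"unknown role {role!r} for task {task_id!r}")
        roles.append(role)
    return roles
```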

classmethod test_output(test_id_dicts: dict[dict[str, str]]) → None

Output is the third stage of testing for the sponsor. In this stage, the sponsor would:

Get the test outputs sent from all the assistors

Evaluate test results

Parameters

test_id_dicts : dict[dict[str, str]]

Returns

None
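The "Evaluate test results" step is not specified further. As one possible illustration, the sketch below computes RMSE, which is one of the metric names accepted by find_assistor; the function name `rmse` and the plain-list inputs are assumptions, not the package's evaluation code.

```python
import math


def rmse(targets: list[float], predictions: list[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("targets and predictions must be non-empty and equal in length")
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```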

classmethod test_sponsor_match_identifier(test_id: str, test_id_dict: dict[str, str]) → None

Match_identifier is the second stage of testing for the sponsor. In this stage, the sponsor would:

Match the identifiers sent from all the assistors

Test the trained models produced at each round of the training stage

Parameters

test_id : str
test_id_dict : dict[str, str]

Returns

None
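Identifier matching is described only at a high level. A minimal sketch, assuming both sides hold lists of already-encrypted identifiers and that matching means keeping the rows whose identifier appears on both sides, might look like this; `match_identifiers` is a hypothetical name.

```python
def match_identifiers(own_ids: list[str], received_ids: list[str]) -> list[int]:
    """Return the row indices of own_ids whose identifier also appears in received_ids."""
    received = set(received_ids)  # set lookup keeps the scan linear
    return [index for index, identifier in enumerate(own_ids) if identifier in received]
```

The returned indices preserve the order of the local data, so they can be used directly to select the overlapping rows.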

class colda.workflow.api.TrainMainWorkflow

Bases: AbstractTrainMainWorkflow

Manage train workflow

Methods

get_instance find_assistor train_assistor_request train_match_identifier train_sponsor_match_identifier train_assistor_match_identifier train_situation train_sponsor_situation train_assistor_situation train_output stop_train

classmethod find_assistor(max_round: int, assistors: list, task_mode: Literal['classification', 'regression'], model_name: Literal['linear', 'decision_tree', 'svm', 'gradient_boosting', 'mlp'], metric_name: Literal['MAD', 'RMSE', 'R2', 'Accuracy', 'F1', 'AUCROC'], train_file_path: str, train_id_column: str, train_data_column: str, train_target_column: str, task_name: str | None = None, task_description: str | None = None) → str: Start training

Parameters

max_round : int
assistors : list
task_mode : Task_Mode
model_name : Model_Name
metric_name : Metric_Name
train_file_path : str
train_id_column : str
train_data_column : str
train_target_column : str
task_name : str=None
task_description : str=None

Returns

str
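The signature of find_assistor restricts task_mode, model_name, and metric_name to fixed string literals. Type checkers enforce this statically, but a runtime check can be sketched as follows. The literal values are copied from the signature above; the alias names `TaskMode`, `ModelName`, `MetricName` and the helper `validate_choice` are hypothetical.

```python
from typing import Literal, get_args

# Allowed values, copied from the find_assistor signature.
TaskMode = Literal["classification", "regression"]
ModelName = Literal["linear", "decision_tree", "svm", "gradient_boosting", "mlp"]
MetricName = Literal["MAD", "RMSE", "R2", "Accuracy", "F1", "AUCROC"]


def validate_choice(value: str, literal_type, label: str) -> str:
    """Return value unchanged if it is one of the literal's options, else raise."""
    allowed = get_args(literal_type)  # tuple of the permitted strings
    if value not in allowed:
        raise ValueError(f"{label} must be one of {allowed}, got {value!r}")
    return value
```

For example, `validate_choice("svm", ModelName, "model_name")` returns `"svm"`, while an unlisted name raises `ValueError`.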

classmethod get_class() → type[workflow.train_main_workflow.TrainMainWorkflow]: Get current class.

Returns

type[TrainMainWorkflow]

classmethod stop_train(unread_train_stop_notification: dict): Stop training and delete related files. Not yet implemented.

classmethod train_assistor_match_identifier(train_id: str, train_id_dict: dict[str, str]) → None

Match_identifier is the second stage of training for the assistor. In this stage, the assistor would:

Match the identifiers sent from all the assistors

Parameters

train_id : str train_id_dict : dict

Returns

None

classmethod train_assistor_request(train_id_dicts: dict[dict[str, str]]) → None

Request is the first stage of training for the assistor. In this stage, the assistor would:

Encrypt the identifiers

Send the identifiers to the server

Parameters

train_id_dicts : dict[dict[str, str]]

Returns

None

classmethod train_assistor_situation(train_id: str, train_id_dict: dict) → None

Situation is the third stage of training for the assistor. In this stage, the assistor would:

Get the residual (training target) sent from the sponsor

Train the model

Send the trained model output to the sponsor

Parameters

train_id : str
train_id_dict : dict

Returns

None

classmethod train_match_identifier(train_id_dicts: dict[dict[str, str]]) → None: Handle the unread_train_match_identifier. Two cases must be handled: sponsor and assistor

Parameters

train_id_dicts : dict[dict[str, str]]

Returns

None

classmethod train_output(train_id_dicts: dict[dict[str, str]]) → None

Output is the fourth stage of training for the sponsor. In this stage, the sponsor would:

Get the train outputs sent from all the assistors

Evaluate train results

Parameters

train_id_dicts : dict[dict[str, str]]

Returns

None

classmethod train_situation(train_id_dicts: dict[dict[str, str]]) → None: Handle the unread_train_situation. Two cases must be handled: sponsor and assistor

Parameters

train_id_dicts : dict[dict[str, str]]

Returns

None

classmethod train_sponsor_match_identifier(train_id: str, train_id_dict: dict[str, str]) → None

Match_identifier is the second stage of training for the sponsor. In this stage, the sponsor would:

Match the identifiers sent from all the assistors

Calculate the residual (training target)

Send the residual to all assistors

Parameters

train_id : str
train_id_dict : dict[str, str]

Returns

None
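The documentation does not define how the residual is calculated. A common choice, assumed here purely for illustration, is the difference between the target and the current prediction, which is the negative gradient of squared-error loss up to a constant factor. The function name `residuals` and the list-based inputs are hypothetical.

```python
def residuals(targets: list[float], predictions: list[float]) -> list[float]:
    """Return target minus prediction for each row (squared-error assumption)."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return [target - prediction for target, prediction in zip(targets, predictions)]
```

Under this assumption, each assistor would fit its model to these residuals rather than to the original target.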

classmethod train_sponsor_situation(train_id: str, train_id_dict: dict) → None

Situation is the third stage of training for the sponsor. In this stage, the sponsor would:

Train the model

Parameters

train_id : str
train_id_dict : dict

Returns

None

colda.workflow.base module

class colda.workflow.base.BaseWorkflow

Bases: object

Base class for workflow

Attributes

_skip_header _initial_round_num _url_prefix _max_round

Methods

None

colda.workflow.test_main_workflow module

class colda.workflow.test_main_workflow.TestMainWorkflow

Bases: AbstractTestMainWorkflow

Manage test workflow

Methods

get_instance find_test_assistor test_assistor_request test_match_identifier test_sponsor_match_identifier test_assistor_match_identifier test_output

classmethod find_test_assistor(train_id: str, test_file_path: str, test_id_column: str, test_data_column: str, test_target_column: str, test_name: str | None = None, test_description: str | None = None) → str: Start testing with all assistors of the previous train task

Parameters

train_id : str
test_file_path : str
test_id_column : str
test_data_column : str
test_target_column : str
test_name : str=None
test_description : str=None

Returns

str

classmethod get_class() → type[colda.workflow.test_main_workflow.TestMainWorkflow]: Get current class.

Returns

type[TestMainWorkflow]

classmethod test_assistor_match_identifier(test_id: str, test_id_dict: dict[str, str]) → bool

Match_identifier is the second stage of testing for the assistor. In this stage, the assistor would:

Match the identifiers sent from the sponsor

Test the trained models produced at each round of the training stage using test data

Send the test outputs to the sponsor

Parameters

test_id : str
test_id_dict : dict[str, str]

Returns

None

classmethod test_assistor_request(test_id_dicts: dict[dict[str, str]]) → None

Request is the first stage of testing for the assistor. In this stage, the assistor would:

Encrypt the identifiers

Send the identifiers to the server

Parameters

test_id_dicts : dict[dict[str, str]]

Returns

None

classmethod test_match_identifier(test_id_dicts: dict[dict[str, str]]) → None: Handle the unread_test_match_identifier. Two cases must be handled: sponsor and assistor

Parameters

test_id_dicts : dict[dict[str, str]]

Returns

None

classmethod test_output(test_id_dicts: dict[dict[str, str]]) → None

Output is the third stage of testing for the sponsor. In this stage, the sponsor would:

Get the test outputs sent from all the assistors

Evaluate test results

Parameters

test_id_dicts : dict[dict[str, str]]

Returns

None

classmethod test_sponsor_match_identifier(test_id: str, test_id_dict: dict[str, str]) → None

Match_identifier is the second stage of testing for the sponsor. In this stage, the sponsor would:

Match the identifiers sent from all the assistors

Test the trained models produced at each round of the training stage

Parameters

test_id : str
test_id_dict : dict[str, str]

Returns

None

colda.workflow.train_main_workflow module

class colda.workflow.train_main_workflow.TrainMainWorkflow

Bases: AbstractTrainMainWorkflow

Manage train workflow

Methods

get_instance find_assistor train_assistor_request train_match_identifier train_sponsor_match_identifier train_assistor_match_identifier train_situation train_sponsor_situation train_assistor_situation train_output stop_train

classmethod find_assistor(max_round: int, assistors: list, task_mode: Literal['classification', 'regression'], model_name: Literal['linear', 'decision_tree', 'svm', 'gradient_boosting', 'mlp'], metric_name: Literal['MAD', 'RMSE', 'R2', 'Accuracy', 'F1', 'AUCROC'], train_file_path: str, train_id_column: str, train_data_column: str, train_target_column: str, task_name: str | None = None, task_description: str | None = None) → str: Start training

Parameters

max_round : int
assistors : list
task_mode : Task_Mode
model_name : Model_Name
metric_name : Metric_Name
train_file_path : str
train_id_column : str
train_data_column : str
train_target_column : str
task_name : str=None
task_description : str=None

Returns

str

classmethod get_class() → type[colda.workflow.train_main_workflow.TrainMainWorkflow]: Get current class.

Returns

type[TrainMainWorkflow]

classmethod stop_train(unread_train_stop_notification: dict): Stop training and delete related files. Not yet implemented.

classmethod train_assistor_match_identifier(train_id: str, train_id_dict: dict[str, str]) → None

Match_identifier is the second stage of training for the assistor. In this stage, the assistor would:

Match the identifiers sent from all the assistors

Parameters

train_id : str train_id_dict : dict

Returns

None

classmethod train_assistor_request(train_id_dicts: dict[dict[str, str]]) → None

Request is the first stage of training for the assistor. In this stage, the assistor would:

Encrypt the identifiers

Send the identifiers to the server

Parameters

train_id_dicts : dict[dict[str, str]]

Returns

None

classmethod train_assistor_situation(train_id: str, train_id_dict: dict) → None

Situation is the third stage of training for the assistor. In this stage, the assistor would:

Get the residual (training target) sent from the sponsor

Train the model

Send the trained model output to the sponsor

Parameters

train_id : str
train_id_dict : dict

Returns

None

classmethod train_match_identifier(train_id_dicts: dict[dict[str, str]]) → None: Handle the unread_train_match_identifier. Two cases must be handled: sponsor and assistor

Parameters

train_id_dicts : dict[dict[str, str]]

Returns

None

classmethod train_output(train_id_dicts: dict[dict[str, str]]) → None

Output is the fourth stage of training for the sponsor. In this stage, the sponsor would:

Get the train outputs sent from all the assistors

Evaluate train results

Parameters

train_id_dicts : dict[dict[str, str]]

Returns

None

classmethod train_situation(train_id_dicts: dict[dict[str, str]]) → None: Handle the unread_train_situation. Two cases must be handled: sponsor and assistor

Parameters

train_id_dicts : dict[dict[str, str]]

Returns

None

classmethod train_sponsor_match_identifier(train_id: str, train_id_dict: dict[str, str]) → None

Match_identifier is the second stage of training for the sponsor. In this stage, the sponsor would:

Match the identifiers sent from all the assistors

Calculate the residual (training target)

Send the residual to all assistors

Parameters

train_id : str
train_id_dict : dict[str, str]

Returns

None

classmethod train_sponsor_situation(train_id: str, train_id_dict: dict) → None

Situation is the third stage of training for the sponsor. In this stage, the sponsor would:

Train the model

Parameters

train_id : str
train_id_dict : dict

Returns

None

colda.workflow.utils module

class colda.workflow.utils.CheckSponsor

Bases: object

assistor: Final[str] = 'assistor'

sponsor: Final[str] = 'sponsor'

colda.workflow.utils.check_Algorithm_return_value(check_list, first_val, second_val)

Parameters:

first_val – String. The first value to check.
second_val – String. The second value to check.

Returns:

Boolean

Raises:

OSError – Placeholder.

colda.workflow.utils.handle_Algorithm_return_value(name, return_val, first_val, second_val)

Check whether the value returned by the Algorithm matches the expected values, e.g. return_val[0] == first_val (‘200’) and return_val[1] == second_val (‘make_train’)

Parameters:

name – String. The name of the current return_val
return_val – String. Contains the status code, name, and paths returned by the Algorithm
first_val – String. The first value to check
second_val – String. The second value to check

Returns:

return_val that has been split

Raises:

OSError – Placeholder.
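The check described for handle_Algorithm_return_value can be sketched as "split, compare the first two fields, return the parts". This is an illustration under assumptions: the separator character `?` and the function name `split_and_check` are hypothetical, and the sketch raises `ValueError` rather than the placeholder `OSError` listed above.

```python
def split_and_check(
    name: str,
    return_val: str,
    first_val: str,
    second_val: str,
    sep: str = "?",  # hypothetical separator; the real delimiter is not documented
) -> list[str]:
    """Split return_val on sep and verify its first two fields."""
    parts = return_val.split(sep)
    if len(parts) < 2 or parts[0] != first_val or parts[1] != second_val:
        raise ValueError(f"{name}: unexpected return value {return_val!r}")
    return parts
```

For example, `split_and_check("make_train", "200?make_train?/tmp/a", "200", "make_train")` returns the three fields as a list.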

colda.workflow.utils.is_max_round_valid(max_round: int) → bool
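No description is given for is_max_round_valid. A plausible reading, assumed here, is that it accepts only positive integers. The sketch below uses the hypothetical name `is_positive_round_count` to make clear that it is not the package's implementation.

```python
def is_positive_round_count(max_round: int) -> bool:
    """Return True only for integers >= 1 (bool is rejected although it subclasses int)."""
    if isinstance(max_round, bool) or not isinstance(max_round, int):
        return False
    return max_round >= 1
```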

colda.workflow.utils.load_file(file_address)

start task with all assistors

Parameters:

file_address – Integer. Maximum training round
file_content – List. The List of assistors’ usernames

Returns:

Tuple. Contains a string ‘handleTrainRequest successfully’ and the task id

Raises:

OSError – Placeholder.

colda.workflow.utils.obtain_notification_information(notification_dict: dict[str, Any], test_indicator: str = 'train') → tuple[str, str, int] | tuple[str, str, int, str]: Parse the notification dict

Parameters

notification_dict : dict[str, Any]
test_indicator : str

Returns

tuple[str, str, int] | tuple[str, str, int, str]

colda.workflow.utils.save_file(file_address, file_content)

start task with all assistors

Parameters:

file_address – Integer. Maximum training round
file_content – List. The List of assistors’ usernames

Returns:

Tuple. Contains a string ‘handleTrainRequest successfully’ and the task id

Raises:

OSError – Placeholder.