This module collects the distance-based metrics used to evaluate the novelty of synthetically generated datasets.

The module computes the Gower’s distance between each row of x_dataframe and each row of y_dataframe, and from these distances derives metrics such as the Distance to Closest Record (DCR) and the DCR Share (see validation_dcr_test() below for more information).
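
To make the two quantities concrete, here is a minimal pure-Python sketch of Gower’s distance and of the DCR derived from it. It is not the module's implementation: the helper names gower_distance and dcr, the representation of rows as dictionaries, and the ranges argument are all assumptions made for illustration.

```python
def gower_distance(a, b, ranges):
    """Gower's distance between two records given as dicts.

    `ranges` (an assumed argument) maps each numerical column to its
    max - min range; columns absent from `ranges` are treated as categorical.
    """
    total = 0.0
    for col in a:
        if col in ranges:
            # Numerical feature: absolute difference scaled by the column range.
            span = ranges[col]
            total += abs(a[col] - b[col]) / span if span else 0.0
        else:
            # Categorical feature: 0 on a match, 1 otherwise.
            total += 0.0 if a[col] == b[col] else 1.0
    # Average the per-feature contributions.
    return total / len(a)


def dcr(x_rows, y_rows, ranges):
    """For each row of x, the smallest Gower's distance to any row of y."""
    return [min(gower_distance(x, y, ranges) for y in y_rows) for x in x_rows]


# Toy data, invented for this example.
synth = [{"age": 30, "city": "Rome"}, {"age": 50, "city": "Turin"}]
real = [{"age": 30, "city": "Rome"}, {"age": 40, "city": "Milan"}]
ranges = {"age": 20}  # range of "age" over both toy tables (50 - 30)

print(dcr(synth, real, ranges))  # [0.0, 0.75]
```

The first synthetic row is an exact copy of a real row, so its DCR is 0. The second differs from its closest real row in both features, which gives (0.5 + 1) / 2 = 0.75.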

APIs


distance_to_closest_record()

dcr_stats()

number_of_dcr_equal_to_zero()

validation_dcr_test()

Examples


The dcr_name argument, passed as the first argument to most of these functions, serves as a label for the report produced by the function report().

The functions accept only specific values for the dcr_name argument; the calls below use "synth_train" and "synth_val".

The path_to_json argument can be used to specify the directory in which the newly computed information is saved as a JSON file.

dcr_synth_train = distance_to_closest_record(
    "synth_train",
    synth_data,
    real_data,
    path_to_json="path/to/json/",
)

dcr_synth_valid = distance_to_closest_record(
    "synth_val",
    synth_data,
    valid_data,
    path_to_json="path/to/json/",
)

dcr_stats_synth_train = dcr_stats(
    "synth_train",
    dcr_synth_train,
    path_to_json="path/to/json/",
)

dcr_stats_synth_valid = dcr_stats(
    "synth_val",
    dcr_synth_valid,
    path_to_json="path/to/json/",
)

dcr_zero_synth_train = number_of_dcr_equal_to_zero(
    "synth_train",
    dcr_synth_train,
    path_to_json="path/to/json/",
)

dcr_zero_synth_valid = number_of_dcr_equal_to_zero(
    "synth_val",
    dcr_synth_valid,
    path_to_json="path/to/json/",
)

share = validation_dcr_test(
    dcr_synth_train,
    dcr_synth_valid,
    path_to_json="path/to/json/",
)
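
As a rough illustration of what the last calls summarize, the sketch below counts DCR values equal to zero and computes a share from two DCR vectors. It rests on an assumption not stated in this section: that the DCR Share is the percentage of synthetic rows that lie strictly closer to the training set than to the validation set. The helper names count_zero_dcr and dcr_share are hypothetical and are not part of the module.

```python
def count_zero_dcr(dcr_values):
    """Number of synthetic rows whose closest record is at distance 0 (exact copies)."""
    return sum(1 for d in dcr_values if d == 0)


def dcr_share(dcr_train, dcr_valid):
    """Assumed definition: percentage of synthetic rows that are strictly
    closer to the training set than to the validation set."""
    closer = sum(1 for t, v in zip(dcr_train, dcr_valid) if t < v)
    return 100.0 * closer / len(dcr_train)


# Toy DCR vectors, invented for this example (one entry per synthetic row).
toy_dcr_train = [0.0, 0.2, 0.5, 0.1]
toy_dcr_valid = [0.1, 0.2, 0.3, 0.4]

print(count_zero_dcr(toy_dcr_train))            # 1
print(dcr_share(toy_dcr_train, toy_dcr_valid))  # 50.0
```

Under this assumed definition, a share close to 50% suggests the synthetic rows are no closer to the training data than to unseen validation data, while a much higher share hints that the generator may be reproducing training records.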

Note that if the JSON file already exists in the specified directory, new information is appended to it, while entries already present in the file are updated.
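
The append-or-update behaviour described above can be sketched as follows. This is only an illustration of the idea, not the module's code: the helper update_json, the file name data.json and the keys used are all hypothetical.

```python
import json
import os
import tempfile


def update_json(directory, key, value, filename="data.json"):
    """Hypothetical helper: add or overwrite `key` in a JSON file under `directory`."""
    path = os.path.join(directory, filename)
    data = {}
    if os.path.exists(path):
        # Reuse whatever is already stored in the file.
        with open(path) as fh:
            data = json.load(fh)
    # A new key is appended; an existing key is updated in place.
    data[key] = value
    with open(path, "w") as fh:
        json.dump(data, fh, indent=2)
    return data


with tempfile.TemporaryDirectory() as tmp:
    update_json(tmp, "dcr_zero_synth_train", 3)           # file is created
    update_json(tmp, "dcr_zero_synth_val", 1)             # new entry appended
    result = update_json(tmp, "dcr_zero_synth_train", 5)  # existing entry updated

print(result)  # {'dcr_zero_synth_train': 5, 'dcr_zero_synth_val': 1}
```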