performance.statistical_significance

Authors: Arthur Rodrigues Scarpatto and Leonardo de Sousa Marques

Affiliation: Embedded Computing Lab (ECL), Federal University of Santa Catarina (UFSC)

Description: Calculates the statistical significance of speedup results using the Wilcoxon-Mann-Whitney test. Uses the configuration file to locate the baseline and codec result paths automatically.

class SoftwareType(*values)[source]

Bases: Enum

ENCODER = 'encoder'

DECODER = 'decoder'

exception ArgumentMismatchError(arg_name, file)[source]

Bases: ValueError

Exception raised when an expected Lightfield or BPP configuration is missing from a result.

Parameters:

arg_name (str)
file (str)

__init__(arg_name, file)[source]

Initializes ArgumentMismatchError.

Parameters:

arg_name (str) – Name of the argument that caused the mismatch
file (str) – Path or identifier of the file where the mismatch occurred

Returns:

None

Return type:

None
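A minimal sketch of how an exception with this signature could be defined and raised. The message wording, the stored attributes, and the helper `require_key` are hypothetical and only illustrate the documented `(arg_name, file)` constructor.

```python
class ArgumentMismatchError(ValueError):
    """Sketch only: the real message format is not documented here."""

    def __init__(self, arg_name: str, file: str) -> None:
        self.arg_name = arg_name
        self.file = file
        # Hypothetical message wording, chosen for illustration.
        super().__init__(f"'{arg_name}' not found in results file '{file}'")


def require_key(results: dict, key: str, file: str) -> list:
    """Hypothetical helper: return results[key] or raise the mismatch error."""
    if key not in results:
        raise ArgumentMismatchError(key, file)
    return results[key]
```

Because the class derives from ValueError, callers that already catch ValueError also catch this error.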

log(msg, verbose=False)[source]

Logs a message to stdout, with optional verbose filtering.

Parameters:

msg (str) – Message to log
verbose (bool) – If True, the message is logged only when VERBOSITY > 0

Returns:

None

Return type:

None
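The verbose filtering described above could look roughly like the following. This is a sketch, assuming VERBOSITY is a module-level integer; how the real module sets it is not documented here.

```python
VERBOSITY = 0  # assumed module-level setting; 0 means "quiet"


def log(msg: str, verbose: bool = False) -> None:
    """Print msg, suppressing verbose-only messages when VERBOSITY is 0."""
    if verbose and VERBOSITY <= 0:
        return
    print(msg)
```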

komolgorov_smirnov(X, Y)[source]

Performs the Kolmogorov-Smirnov two-sample test to check whether two distributions differ only by a shift.

Given two samples X and Y of floating-point values, normalized by the median and sorted in ascending order, determines whether X and Y follow the same distribution up to a shift delta, i.e. Fy(t) = Fx(t + delta).

Parameters:

X (List[float]) – First sample of floating point values
Y (List[float]) – Second sample of floating point values

Returns:

True if distributions are the same (differ only by delta), False otherwise

Return type:

bool
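As a rough illustration of the shift check described above, the sketch below centres each sample on its median and compares the empirical CDFs. It is not the module's implementation: the helper names `ks_statistic` and `same_shape`, the choice of subtracting the median (the documentation only says "normalized by the median"), the 0.05 level, and the large-sample critical value are all assumptions.

```python
import math
from bisect import bisect_right
from statistics import median
from typing import List


def ks_statistic(x: List[float], y: List[float]) -> float:
    """Largest absolute gap between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for t in xs + ys:
        fx = bisect_right(xs, t) / len(xs)  # fraction of x values <= t
        fy = bisect_right(ys, t) / len(ys)  # fraction of y values <= t
        d = max(d, abs(fx - fy))
    return d


def same_shape(x: List[float], y: List[float], alpha: float = 0.05) -> bool:
    """True if the median-centred samples are not distinguishable at level alpha."""
    # Centre each sample on its median so that a pure shift cancels out.
    mx, my = median(x), median(y)
    xc = [v - mx for v in x]
    yc = [v - my for v in y]
    n, m = len(xc), len(yc)
    # Large-sample approximation of the two-sample critical value.
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)
    d_crit = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(xc, yc) <= d_crit
```

The asymptotic critical value is only an approximation for small samples; an exact table or a library routine would be preferable when sample counts are low.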

extract_log_execution_times(filepath, software_type)[source]

Given a file representing an execution log in the new format, extracts from it a dictionary of the form:

    {
        "lf_1": {
            "bpp_1": [sample_1, sample_2, …],
            (…)
            "bpp_n": [sample_n, …],
        },
        (…)
        "lf_n": {
            "bpp_1": [sample_1, …],
            (…)
            "bpp_n": [sample_n, …],
        },
    }

Uses the new log format with log_data containing time_ns directly.

Parameters:

filepath (Path)
software_type (SoftwareType)

Return type:

Dict[str, Dict[float, List[float]]]
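The nested mapping returned here can be walked per light field and per BPP. The sketch below pairs a baseline mapping with a result mapping and computes a ratio of medians; the helper `median_speedups`, the ratio-of-medians definition of speedup, and the sample values are illustrative assumptions, not taken from the module.

```python
from statistics import median
from typing import Dict, List

Times = Dict[str, Dict[float, List[float]]]


def median_speedups(baseline: Times, result: Times) -> Dict[str, Dict[float, float]]:
    """Ratio of median baseline time to median result time, per (lf, bpp)."""
    speedups: Dict[str, Dict[float, float]] = {}
    for lf, by_bpp in baseline.items():
        for bpp, base_samples in by_bpp.items():
            # A missing lf or bpp raises KeyError in this sketch.
            new_samples = result[lf][bpp]
            speedups.setdefault(lf, {})[bpp] = median(base_samples) / median(new_samples)
    return speedups


# Made-up timings in nanoseconds, for illustration only.
baseline = {"lf_1": {0.1: [200.0, 210.0, 190.0]}}
result = {"lf_1": {0.1: [100.0, 105.0, 95.0]}}
speedups = median_speedups(baseline, result)
```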

get_statistical_significances(X, Y)[source]

Calculates Wilcoxon-Mann-Whitney statistical significance between baseline and result samples.

Follows the flow outlined by Touati et al. (2012).

Parameters:

X (List[float]) – Baseline sample
Y (List[float]) – Result sample

Returns:

Tuple of (is_significant, message)

Return type:

Tuple[bool, str]
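For orientation, the Wilcoxon-Mann-Whitney step on its own can be sketched in pure Python as below. This covers only the rank test, not the full flow of Touati et al. (2012), and it uses the normal approximation with a tie correction and no continuity correction; the function name `mann_whitney_u` is an assumption.

```python
from math import erfc, sqrt
from typing import List, Tuple


def mann_whitney_u(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Return (U statistic of x, two-sided p-value via normal approximation)."""
    n, m = len(x), len(y)
    total = n + m
    # Sort the pooled sample, remembering each value's original position.
    pooled = sorted((v, i) for i, v in enumerate(x + y))
    ranks = [0.0] * total
    tie_term = 0.0
    start = 0
    while start < total:
        end = start
        while end + 1 < total and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # ranks are 1-based; ties share the mean
        for k in range(start, end + 1):
            ranks[pooled[k][1]] = avg_rank
        t = end - start + 1
        tie_term += t**3 - t
        start = end + 1
    u_x = sum(ranks[:n]) - n * (n + 1) / 2
    mean_u = n * m / 2
    var_u = n * m / 12 * ((total + 1) - tie_term / (total * (total - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / sqrt(var_u)
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u_x, p
```

A caller would typically compare the p-value with a chosen risk level (for example 0.05) to decide significance. For small samples an exact method, such as the one offered by `scipy.stats.mannwhitneyu`, is more appropriate than the normal approximation.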

find_codec_json_file(results_path)[source]

Finds the JSON log file in the results path.

Parameters:

results_path (Path) – Path to the results directory

Returns:

Path to the first JSON file found

Return type:

Path

Raises:

FileNotFoundError – If no JSON file is found in the path
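A lookup with this contract could be sketched with pathlib as follows. The name `find_first_json`, the non-recursive `*.json` pattern, and the sorted ordering are assumptions; the documentation does not say how "first" is determined.

```python
from pathlib import Path


def find_first_json(results_path: Path) -> Path:
    """Return the first *.json file directly inside results_path."""
    # sorted() makes "first" deterministic; the real ordering is not documented.
    candidates = sorted(results_path.glob("*.json"))
    if not candidates:
        raise FileNotFoundError(f"No JSON file found in {results_path}")
    return candidates[0]
```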

main()[source]

Return type:

None