EpiDataVault

epilepsy_tools.epidatavault.build_patient_datavault(annotations, patient_numbers, seizure_types=None, log18=None, log23=None, save_path=None)

Extract patient annotations for a list of patients.

Parameters:
  • annotations (pandas.ExcelFile) – Excel file containing patient annotations.

  • patient_numbers (list[str]) – List of patient numbers in the format pXXX.

  • seizure_types (list[str] | None, optional) – List of seizure types to extract information for, by default None.

  • log18 (pandas.DataFrame | None, optional) – DataFrame containing patient log information from 2018, by default None.

  • log23 (pandas.DataFrame | None, optional) – DataFrame containing patient log information from 2023, by default None.

  • save_path (str | os.PathLike | None, optional) – Path to save the extracted data, by default None.

Returns:

A DataFrame containing the extracted patient annotation information.

The DataFrame will have the following columns:

  • patient_num (str): Patient number in the format “pXXX”.

  • patient_id (str): Patient ID (CHUM file number).

  • patient_name (str): Name of the patient.

  • start_date (pandas.Timestamp): Date and time when annotation monitoring started, in “YYYY-MM-DD HH:MM:SS” format.

  • end_date (pandas.Timestamp): Date and time when annotation monitoring ended, in “YYYY-MM-DD HH:MM:SS” format.

  • num_seizures (dict[str, int]): A dictionary containing the count of each seizure type for the patient.

Return type:

pandas.DataFrame

Raises:

ValueError – Raised when neither log18 nor log23 is passed as a pandas.DataFrame.
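The num_seizures column holds one dictionary per patient, so cohort-level totals need an explicit aggregation step. A minimal sketch of that step, using plain dictionaries as a stand-in for the rows of the returned DataFrame; the seizure type labels ("FAS", "FBTCS") and all counts are made-up illustration values, and total_seizures_by_type is a hypothetical helper, not part of epilepsy_tools:

```python
from collections import Counter

# Stand-in for rows of the returned DataFrame; all values are invented.
rows = [
    {"patient_num": "p001", "num_seizures": {"FAS": 2, "FBTCS": 1}},
    {"patient_num": "p002", "num_seizures": {"FAS": 1}},
]


def total_seizures_by_type(rows):
    """Sum the per-patient num_seizures dictionaries into one Counter."""
    totals = Counter()
    for row in rows:
        # Counter.update adds counts instead of overwriting them.
        totals.update(row["num_seizures"])
    return totals


totals = total_seizures_by_type(rows)
print(dict(totals))
```

With a real DataFrame, the same loop could run over the values of the num_seizures column.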

epilepsy_tools.epidatavault.build_seizure_datavault(annotations, patient_numbers, seizure_types=None, save_path=None)

Extract seizure annotations for a list of patients.

Parameters:
  • annotations (pandas.ExcelFile) – Excel file containing patient annotations.

  • patient_numbers (list[str]) – List of patient numbers in the format ‘pXXX’.

  • seizure_types (list[str] | None, optional) – List of seizure types to extract annotations for.

  • save_path (str | os.PathLike | None, optional) – Path to save the extracted data.

Returns:

DataFrame containing seizure annotations for the specified patients.

Each row in the DataFrame represents a seizure, and the columns are as follows:

  • p_num (str): Patient number in the format pXXX.

  • sz_id (int): Sequential seizure number for the patient.

  • sz_type (str): Seizure classification according to the ILAE classification.

  • sz_date (Timestamp): Date when the seizure happened.

  • electric_onset (Timestamp): Time when the electrical seizure activity began.

  • clinical_onset (Timestamp): Time when clinical seizure manifestations started.

  • generalization (Timestamp): Time when generalization of the seizure manifestations started.

  • motor_onset (Timestamp): Time when motor seizure manifestations started.

  • sz_offset (Timestamp): Time when the seizure ended.

Return type:

pandas.DataFrame

Notes

All onsets and the offset share the same format: “YYYY-MM-DD HH:MM:SS”.
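As an illustration of the timestamp format, the sketch below parses two values and computes a seizure duration with the standard library. It assumes the onsets and offset have been read back as strings (for example from a saved file); the timestamps are invented and seizure_duration_seconds is a hypothetical helper, not part of epilepsy_tools:

```python
from datetime import datetime

# strptime pattern matching "YYYY-MM-DD HH:MM:SS".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def seizure_duration_seconds(onset: str, offset: str) -> float:
    """Return the time between an onset and the seizure offset, in seconds."""
    start = datetime.strptime(onset, TIMESTAMP_FORMAT)
    end = datetime.strptime(offset, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


# Invented example values: electric_onset and sz_offset of one seizure.
duration = seizure_duration_seconds("2023-01-15 03:12:40", "2023-01-15 03:14:05")
print(duration)  # 85.0
```

When the columns are already pandas.Timestamp objects, subtracting them directly gives the same interval without parsing.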

epilepsy_tools.epidatavault.count_seizures(annotations, seizure_types)

Count the number of seizures of each type for a patient. If no seizure types are provided, all seizure types will be counted.

Parameters:
  • annotations (pandas.DataFrame) – DataFrame containing patient annotations.

  • seizure_types (list[str]) – List of seizure types to count.

Returns:

The total number of seizures counted.

Return type:

int
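A minimal sketch of the counting idea, written with the standard library rather than the function itself: seizures are tallied per type, optionally restricted to the requested types, and the tallies are summed into one total. The input is assumed to be a plain list of seizure type labels; the labels and count_by_type are hypothetical and do not reflect how count_seizures reads the annotations:

```python
from collections import Counter


def count_by_type(labels, seizure_types=None):
    """Tally labels per seizure type; keep only seizure_types if provided."""
    counts = Counter(labels)
    if seizure_types:
        # A Counter returns 0 for types that never occur in the labels.
        counts = Counter({t: counts[t] for t in seizure_types})
    return counts


# Invented labels for one patient.
labels = ["FAS", "FIAS", "FAS", "FBTCS"]

per_type = count_by_type(labels, ["FAS", "FBTCS"])
total = sum(per_type.values())
print(dict(per_type), total)
```

Passing no seizure types to count_by_type tallies every label, which mirrors the "all seizure types will be counted" behaviour described above.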

epilepsy_tools.epidatavault.extract_annotation_dates(annotations, patient_number)

Extract the start and end dates of the annotations for a patient.

Parameters:
  • annotations (pandas.ExcelFile) – Excel file containing patient annotations.

  • patient_number (str) – Patient number in the format pXXX.

Returns:

A tuple containing the start and end dates of the annotations.

Return type:

tuple[pandas.Timestamp, pandas.Timestamp]

epilepsy_tools.epidatavault.extract_seizure_info(annotations, patient_number, seizure_types)

Extract seizure information for a patient.

Parameters:
  • annotations (pandas.DataFrame) – DataFrame containing patient annotations.

  • patient_number (str) – Patient number in the format pXXX.

  • seizure_types (list[str]) – List of seizure types to extract information for.

Returns:

A dictionary containing the extracted seizure information.

The dictionary will have the following keys:

  • p_num (list[str]): Patient number in the format pXXX.

  • sz_id (list[int]): Sequential seizure number for the patient.

  • sz_type (list[str]): Seizure classification according to the ILAE classification.

  • sz_date (list[pandas.Timestamp]): Date when the seizure happened.

  • electric_onset (list[pandas.Timestamp]): Time when the electrical seizure activity began.

  • clinical_onset (list[pandas.Timestamp]): Time when clinical seizure manifestations started.

  • generalization (list[pandas.Timestamp]): Time when generalization of the seizure manifestations started.

  • motor_onset (list[pandas.Timestamp]): Time when motor seizure manifestations started.

  • sz_offset (list[pandas.Timestamp]): Time when the seizure ended.

Return type:

dict

Notes

All onsets and the offset share the same format: “YYYY-MM-DD HH:MM:SS”.
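Because every key maps to a list, the dictionary is column-oriented: position i in each list describes seizure i. The sketch below turns such a dictionary into one record per seizure. It uses only three of the keys, with invented values, and to_rows is a hypothetical helper, not part of epilepsy_tools:

```python
# Column-oriented stand-in for the returned dictionary; values are invented.
info = {
    "p_num": ["p001", "p001"],
    "sz_id": [1, 2],
    "sz_type": ["FAS", "FBTCS"],
}


def to_rows(columns):
    """Convert a dict of equal-length lists into a list of per-seizure dicts."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all lists must have the same length")
    keys = list(columns)
    # zip(*...) walks the lists in parallel, one seizure at a time.
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


rows = to_rows(info)
print(rows[0])
```

Passing the same dictionary to pandas.DataFrame would give one row per seizure in the same way.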

epilepsy_tools.epidatavault.generate_patient_numbers_list(annotations, selection='all', parient_range=None)

Generate a list of patient numbers based on the specified selection mode.

Parameters:
  • annotations (pandas.ExcelFile) – Excel file containing patient annotations.

  • selection (Literal["all", "range"], optional) – Selection mode. "all": Extracts patient numbers from the sheet names in the annotations file (default). "range": Generates patient numbers within a specified range.

  • parient_range (list[int] | None, optional) – A list containing two integers [start, end] for range selection, by default None.

Returns:

A list of patient numbers in the format ‘pXXX’.

Return type:

list[str]

Raises:

ValueError – Raised when the Excel file cannot be loaded or invalid arguments are passed.
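To make the "range" mode concrete, here is a minimal sketch of how a [start, end] pair could expand into pXXX identifiers. Two details are assumptions not confirmed by this page: that the end of the range is inclusive, and that numbers are zero-padded to three digits. patient_numbers_from_range is a hypothetical helper, not the library's implementation:

```python
def patient_numbers_from_range(patient_range):
    """Expand [start, end] into patient numbers such as 'p001'.

    Assumes an inclusive end and three-digit zero padding.
    """
    if len(patient_range) != 2:
        raise ValueError("patient_range must be [start, end]")
    start, end = patient_range
    if start > end:
        raise ValueError("start must not exceed end")
    return [f"p{n:03d}" for n in range(start, end + 1)]


numbers = patient_numbers_from_range([1, 3])
print(numbers)  # ['p001', 'p002', 'p003']
```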

epilepsy_tools.epidatavault.load_annotation_file(annotations_path)

Load the annotation file and validate essential columns.

Parameters:

annotations_path (str) – Path to the Excel file containing the annotations.

Returns:

The loaded annotations file.

Return type:

pandas.ExcelFile

epilepsy_tools.epidatavault.load_patient_log(log_path, log_type, password=None, header=1)

Load log18 or log23, decrypting if necessary, and return a cleaned DataFrame.

Parameters:
  • log_path (str) – Path to the Excel file.

  • log_type (str) – Either "log18" or "log23" (the latter is plain, i.e. not encrypted).

  • password (str | None, optional) – Password to decrypt the file (only for log18), by default None.

  • header (int, optional) – Row number to use as column names, by default 1. At the time of writing, header=1 works; if the log layout changes, verify the header row in the log file.

Returns:

Cleaned DataFrame.

Return type:

pandas.DataFrame

Raises:

ValueError – Raised when the provided log_type is not allowed.

Examples

See examples in EpiDataVault Examples.
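As a rough orientation, the functions documented on this page might be chained as sketched below. The call signatures follow this page; the wrapper build_vaults, its parameter names, and any file paths passed to it are hypothetical, and the sketch has not been checked against real annotation or log files:

```python
def build_vaults(annotations_path, log18_path, log23_path, log18_password=None):
    """Load the inputs and build the patient and seizure datavaults."""
    # Imported lazily so the sketch can be defined without the package installed.
    from epilepsy_tools import epidatavault as edv

    # Load the annotation workbook and list every patient found in it.
    annotations = edv.load_annotation_file(annotations_path)
    patient_numbers = edv.generate_patient_numbers_list(annotations, selection="all")

    # Per this page, a password is only relevant for log18.
    log18 = edv.load_patient_log(log18_path, "log18", password=log18_password)
    log23 = edv.load_patient_log(log23_path, "log23")

    patients = edv.build_patient_datavault(
        annotations, patient_numbers, log18=log18, log23=log23
    )
    seizures = edv.build_seizure_datavault(annotations, patient_numbers)
    return patients, seizures
```

Both builders also accept seizure_types and save_path, which are left at their defaults here.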