EpiDataVault
- epilepsy_tools.epidatavault.build_patient_datavault(annotations, patient_numbers, seizure_types=None, log18=None, log23=None, save_path=None)
Extract patient annotations for a list of patients.
- Parameters:
annotations (pandas.ExcelFile) – Excel file containing patient annotations.
patient_numbers (list[str]) – List of patient numbers in the format pXXX.
seizure_types (list[str] | None, optional) – List of seizure types to extract information for, by default None.
log18 (pandas.DataFrame | None, optional) – DataFrame containing patient log information from 2018, by default None.
log23 (pandas.DataFrame | None, optional) – DataFrame containing patient log information from 2023, by default None.
save_path (str | os.PathLike | None, optional) – Path to save the extracted data, by default None.
- Returns:
A DataFrame containing the extracted patient annotation information.
The DataFrame will have the following columns:
patient_num (str): Patient number in the format “pXXX”.
patient_id (str): Patient ID (CHUM file number).
patient_name (str): Name of the patient.
start_date (pandas.Timestamp): Date and time when annotation monitoring started, in “YYYY-MM-DD HH:MM:SS” format.
end_date (pandas.Timestamp): Date and time when annotation monitoring ended, in “YYYY-MM-DD HH:MM:SS” format.
num_seizures (dict[str, int]): A dictionary containing the count of each seizure type for the patient.
- Return type:
pandas.DataFrame
- Raises:
ValueError – Raised when a pandas.DataFrame is passed for neither log18 nor log23.
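
The requirement that at least one log be supplied can be illustrated with a small guard. This is only a sketch of the documented contract, not the library's code: `require_a_log` is a hypothetical helper, and treating "not passed" as `None` is an assumption based on the default values in the signature.

```python
def require_a_log(log18=None, log23=None):
    """Raise ValueError when neither log is supplied (hypothetical helper)."""
    # Assumption: a log that was not passed is represented by None.
    if log18 is None and log23 is None:
        raise ValueError("Pass a DataFrame for log18, log23, or both.")


# Calling without any log triggers the error described under "Raises".
try:
    require_a_log()
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Checking both arguments up front means the caller gets the error before any annotation sheet is read.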
- epilepsy_tools.epidatavault.build_seizure_datavault(annotations, patient_numbers, seizure_types=None, save_path=None)
Extract seizure annotations for a list of patients.
- Parameters:
annotations (pandas.ExcelFile) – Excel file containing patient annotations.
patient_numbers (list[str]) – List of patient numbers in the format ‘pXXX’.
seizure_types (list[str] | None, optional) – List of seizure types to extract annotations for.
save_path (str | os.PathLike | None, optional) – Path to save the extracted data.
- Returns:
A DataFrame containing seizure annotations for the specified patients. Each row represents a seizure, and the columns are as follows:
p_num (str): Patient number in the format pXXX.
sz_id (int): Seizure number that each patient had.
sz_type (str): Seizure classification according to the ILAE classification.
sz_date (Timestamp): Date when the seizure happened.
electric_onset (Timestamp): Time when the electrical seizure activity began.
clinical_onset (Timestamp): Time when clinical seizure manifestations started.
generalization (Timestamp): Time when generalization of the seizure manifestations started.
motor_onset (Timestamp): Time when motor seizure manifestations started.
sz_offset (Timestamp): Time when the seizure ended.
- Return type:
pandas.DataFrame
Notes
All onsets and the offset share the same format: “YYYY-MM-DD HH:MM:SS”.
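
As an illustration of the “YYYY-MM-DD HH:MM:SS” format, the sketch below parses two such strings with the standard library and computes the time between them. The helper name and the timestamps are placeholders, not values from any annotation file. With pandas.Timestamp values, as the columns above are typed, subtracting one from the other works directly without parsing.

```python
from datetime import datetime

# strptime pattern matching "YYYY-MM-DD HH:MM:SS".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def seizure_duration_seconds(onset, offset):
    """Return the seconds elapsed between two formatted timestamps."""
    start = datetime.strptime(onset, TIMESTAMP_FORMAT)
    end = datetime.strptime(offset, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


# Placeholder values standing in for electric_onset and sz_offset.
duration = seizure_duration_seconds("2023-01-05 14:02:10", "2023-01-05 14:03:25")
```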
- epilepsy_tools.epidatavault.count_seizures(annotations, seizure_types)
Count the number of seizures of each type for a patient. If no seizure types are provided, all seizure types will be counted.
- Parameters:
annotations (pandas.DataFrame) – DataFrame containing patient annotations.
seizure_types (list[str]) – List of seizure types to count.
- Returns:
An int giving the total number of seizures counted.
- Return type:
int
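
The counting rule described above (restrict to the requested types, or count everything when none are given) can be sketched with `collections.Counter`. This is an illustrative stand-in that works on a plain list of type labels rather than on the annotations DataFrame. `count_by_type` is a hypothetical name and the labels are placeholders.

```python
from collections import Counter


def count_by_type(sz_types, seizure_types=None):
    """Count seizures per type; count every type when no filter is given."""
    if seizure_types:
        wanted = set(seizure_types)
        sz_types = [label for label in sz_types if label in wanted]
    return Counter(sz_types)


# Placeholder labels, not taken from real annotations.
observed = ["FAS", "FIAS", "FAS", "FBTCS"]
per_type = count_by_type(observed, ["FAS", "FBTCS"])
total = sum(per_type.values())
```

Keeping the per-type counts and summing them afterwards yields both the dictionary form used in the num_seizures column and a single total.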
- epilepsy_tools.epidatavault.extract_annotation_dates(annotations, patient_number)
Extract the start and end dates of the annotations for a patient.
- Parameters:
annotations (pandas.ExcelFile) – DataFrame containing patient annotations.
patient_number (str) – Patient number in the format pXXX.
- Returns:
A tuple containing the start and end dates of the annotations.
- Return type:
tuple[pandas.Timestamp, pandas.Timestamp]
- epilepsy_tools.epidatavault.extract_seizure_info(annotations, patient_number, seizure_types)
Extract seizure information for a patient.
- Parameters:
annotations (pandas.DataFrame) – DataFrame containing patient annotations.
patient_number (str) – Patient number in the format pXXX.
seizure_types (list[str]) – List of seizure types to extract information for.
- Returns:
A dictionary containing the extracted seizure information.
The dictionary will have the following keys:
p_num (list[str]): Patient number in the format pXXX.
sz_id (list[int]): Seizure number that each patient had.
sz_type (list[str]): Seizure classification according to the ILAE classification.
sz_date (list[pandas.Timestamp]): Date when the seizure happened.
electric_onset (list[pandas.Timestamp]): Time when the electrical seizure activity began.
clinical_onset (list[pandas.Timestamp]): Time when clinical seizure manifestations started.
generalization (list[pandas.Timestamp]): Time when generalization of the seizure manifestations started.
motor_onset (list[pandas.Timestamp]): Time when motor seizure manifestations started.
sz_offset (list[pandas.Timestamp]): Time when the seizure ended.
- Return type:
Notes
All onsets and the offset share the same format: “YYYY-MM-DD HH:MM:SS”.
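
Because each key maps to a list with one entry per seizure, the dictionary is column-oriented. The sketch below shows one way such a structure could be turned into one record per seizure. `columns_to_rows` is a hypothetical helper, only three of the documented keys are used, and the values are placeholders.

```python
def columns_to_rows(info):
    """Turn a dict of equal-length lists into a list of per-seizure dicts."""
    keys = list(info)
    lengths = {len(info[key]) for key in keys}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    # zip(*columns) walks the lists in parallel, one seizure at a time.
    return [dict(zip(keys, values)) for values in zip(*(info[key] for key in keys))]


# Placeholder data using a subset of the documented keys.
info = {
    "p_num": ["p001", "p001"],
    "sz_id": [1, 2],
    "sz_type": ["FAS", "FIAS"],
}
rows = columns_to_rows(info)
```

A dict of equal-length lists is also a shape that `pandas.DataFrame(info)` accepts directly, giving one row per seizure.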
- epilepsy_tools.epidatavault.generate_patient_numbers_list(annotations, selection='all', parient_range=None)
Generate a list of patient numbers based on the specified selection mode.
- Parameters:
annotations (pandas.ExcelFile) – Excel file containing patient annotations.
selection (Literal["all", "range"], optional) – Selection mode.
"all": Extracts patient numbers from the sheet names in the annotations file (default).
"range": Generates patient numbers within a specified range.
parient_range (list[int] | None, optional) – A list containing two integers [start, end] for range selection, by default None.
- Returns:
A list of patient numbers in the format ‘pXXX’.
- Return type:
list[str]
- Raises:
ValueError – Error loading the Excel file, or invalid arguments were passed.
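
To make the "range" mode concrete, here is a minimal sketch of building pXXX identifiers from a [start, end] pair. It is not the library's implementation. `patient_numbers_from_range` is a hypothetical helper, and both the inclusive end and the zero-padding to three digits are assumptions inferred from the pXXX notation.

```python
def patient_numbers_from_range(patient_range):
    """Build patient numbers such as 'p007' from a [start, end] pair."""
    start, end = patient_range
    if start > end:
        raise ValueError("start must not exceed end")
    # Assumptions: end is inclusive, numbers are zero-padded to three digits.
    return [f"p{number:03d}" for number in range(start, end + 1)]


numbers = patient_numbers_from_range([7, 10])
```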
- epilepsy_tools.epidatavault.load_annotation_file(annotations_path)
Load the annotation file and validate essential columns.
- Parameters:
annotations_path (str) – Path to the Excel file containing the annotations.
- Returns:
The loaded annotations file.
- Return type:
- epilepsy_tools.epidatavault.load_patient_log(log_path, log_type, password=None, header=1)
Load log18 or log23, decrypting if necessary, and return a cleaned DataFrame.
- Parameters:
log_path (str) – Path to the Excel file.
log_type (str) – Either "log18" or "log23" (plain).
password (str | None, optional) – Password to decrypt the file (only for log18), by default None.
header (int, optional) – Row number to use as column names, by default 1. header=1 worked when this function was written; if the log layout changes, verify the correct row in the log file.
- Returns:
Cleaned DataFrame.
- Return type:
pandas.DataFrame
- Raises:
ValueError – The provided log_type is not allowed.
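
The roles of log_type and header can be sketched on an in-memory sheet, without any Excel file or decryption. This is an assumption-laden illustration rather than the library's code. `prepare_log` is a hypothetical helper, the column names are invented, and header is assumed to be zero-indexed as in pandas, so that header=1 selects the second row.

```python
ALLOWED_LOG_TYPES = ("log18", "log23")


def prepare_log(rows, log_type, header=1):
    """Validate log_type, then use rows[header] as the column names."""
    if log_type not in ALLOWED_LOG_TYPES:
        raise ValueError(
            f"log_type must be one of {ALLOWED_LOG_TYPES}, got {log_type!r}"
        )
    columns = rows[header]
    # Rows above the header are discarded; rows below it become records.
    return [dict(zip(columns, row)) for row in rows[header + 1:]]


# Placeholder sheet: a title row, then the header row, then one data row.
sheet = [
    ["Patient log (title row)", None],
    ["patient_num", "admission"],
    ["p001", "2023-01-05"],
]
records = prepare_log(sheet, "log23")
```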
Examples
See examples in EpiDataVault Examples.