Using basic features within the same patient

In this notebook we extract a very simple set of features and train a very simple classifier within the same patient: we take the data from a single recording and split it into train and test sets using cross-validation. The goal is to assess the differences that may arise between patients. We expect almost all patients to behave similarly, but we could be surprised.

import os
from glob import glob
from collections import Counter
from typing import List, Dict

from rich.progress import track
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import mne
import yasa

from sklearn.model_selection import train_test_split, cross_validate
from sklearn.ensemble import RandomForestClassifier

from sleepstagingidal.data import *
from sleepstagingidal.dataa import *
from sleepstagingidal.feature_extraction import *

path_files = glob(os.path.join(path_data, "*.edf"))

channels = ["C3", "C4", "A1", "A2", "O1", "O2", "LOC", "ROC", "LAT1", "LAT2", "ECGL", "ECGR", "CHIN1", "CHIN2"]

Looping through patients

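As we only want to perform a very basic check, we loop through all the patients and repeat the same steps for each recording: read and clean the file, split it into epochs, compute the band powers and cross-validate a random forest on them.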

results = {}

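for path in track(path_files):
    file_name = os.path.basename(path)  # works with any path separator
    # Load and clean the recording (resampling and band-pass filtering)
    raw = read_clean_edf(path, resample=100, bandpass=(0.3, 49))
    epochs, sr = get_epochs(raw, channels=channels)
    # Features: band powers of each epoch; labels: event code of each epoch
    bandpowers = calculate_bandpower(epochs, sf=sr)
    labels = epochs.events[:, -1]
    # Default cross-validation: 5 folds, one test score per fold
    results_cv = cross_validate(RandomForestClassifier(random_state=42), bandpowers, labels)
    results[file_name] = results_cv['test_score']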
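The `calculate_bandpower` helper comes from `sleepstagingidal.feature_extraction` and its implementation is not shown in this notebook. To give a rough idea of what a band-power feature is, here is a minimal sketch that is not the package's code: it estimates the relative power of a single-channel signal in a few frequency bands with a plain FFT periodogram. The function name `relative_bandpower`, the `BANDS` limits and the synthetic signal are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical band limits in Hz; the package may use different ones.
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 12), "beta": (12, 30)}


def relative_bandpower(signal, sf, bands=BANDS):
    """Share of the total power (over the listed bands) that falls in each band."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()  # remove the DC offset
    freqs = np.fft.rfftfreq(signal.size, d=1 / sf)
    psd = np.abs(np.fft.rfft(signal)) ** 2  # unnormalised periodogram
    # The frequency grid is uniform, so the bin width cancels in the ratio.
    power = {
        name: psd[(freqs >= lo) & (freqs < hi)].sum()
        for name, (lo, hi) in bands.items()
    }
    total = sum(power.values())
    return {name: value / total for name, value in power.items()}


# Synthetic 30-second "epoch" sampled at 100 Hz, dominated by a 10 Hz rhythm.
sf = 100
t = np.arange(0, 30, 1 / sf)
rng = np.random.default_rng(0)
epoch = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)

features = relative_bandpower(epoch, sf)
print(max(features, key=features.get))  # alpha
```

Computing such values for every channel of every epoch would give one feature vector per epoch, which is the kind of input the random forest receives in the loop above.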

We can put the logged results into a DataFrame and save them as .csv to avoid having to repeat the experiment:

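df = pd.DataFrame.from_dict(results, orient='index')
df.index.set_names("File", inplace=True)
fold_scores = df.copy()  # per-fold scores only, taken before adding summary columns
df['Mean'] = fold_scores.mean(axis=1)
# Population standard deviation (ddof=0) over the folds, so 'Mean' is not mixed in
df['Std'] = fold_scores.std(axis=1, ddof=0)
df.head()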

File              0         1         2         3         4      Mean       Std
PSG29.edf  0.548611  0.472222  0.444444  0.321678  0.517483  0.460888  0.078328
PSG12.edf  0.411765  0.392157  0.614379  0.398693  0.473684  0.458136  0.083295
PSG17.edf  0.349693  0.561728  0.537037  0.506173  0.209877  0.432902  0.133771
PSG10.edf  0.657895  0.801325  0.629139  0.344371  0.576159  0.601778  0.148749
PSG23.edf  0.873418  0.815287  0.821656  0.878981  0.605096  0.798887  0.100312
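Columns 0 to 4 hold the test accuracy of each fold: when no `cv` argument is given, `cross_validate` runs a 5-fold cross-validation (stratified and without shuffling for classifiers) and returns one score per fold under the `test_score` key. The sketch below shows this on synthetic data; the array shapes, the number of classes and `n_estimators=20` are arbitrary choices for illustration, not values from the experiment.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

rng = np.random.default_rng(0)
# Synthetic stand-in for one recording: 200 epochs, 6 features, 4 stage labels.
X = rng.standard_normal((200, 6))
y = rng.integers(0, 4, size=200)

cv_results = cross_validate(
    RandomForestClassifier(n_estimators=20, random_state=42), X, y
)
scores = cv_results["test_score"]  # one accuracy value per fold
print(scores.shape)  # (5,)
```

Because the folds are not shuffled, consecutive epochs of a recording tend to end up in the same fold, which may partly explain why the scores of a single file can vary a lot from fold to fold.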

os.makedirs("Results", exist_ok=True)  # make sure the output folder exists
df.to_csv("Results/00_basic_features_own_patient.csv")
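To reuse the saved results later without rerunning the loop, the file can be read back with `pd.read_csv`. A minimal sketch, using a small made-up table written to a temporary folder so that it runs on its own; the file name `demo_results.csv`, the file labels and the values are hypothetical:

```python
import os
import tempfile

import pandas as pd

# Hypothetical miniature results table with the same kind of layout.
demo = pd.DataFrame(
    {"0": [0.50, 0.80], "1": [0.40, 0.90], "Mean": [0.45, 0.85]},
    index=pd.Index(["A.edf", "B.edf"], name="File"),
)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "demo_results.csv")
    demo.to_csv(csv_path)
    # index_col="File" restores the file names as the index.
    reloaded = pd.read_csv(csv_path, index_col="File")

# Rank the recordings from best to worst mean accuracy.
ranked = reloaded.sort_values("Mean", ascending=False)
print(ranked.index.tolist())  # ['B.edf', 'A.edf']
```

With the real file, the same `pd.read_csv(..., index_col="File")` call on `Results/00_basic_features_own_patient.csv` should give back the table with the file names as its index.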