How to calculate your sleep duration from your Apple Health data
In this article I will share how I calculated the sleep duration from the raw Apple Health Data pulled from Apple’s HealthKit API.
There are two ways I know of getting Apple Health Data:
export from your iPhone’s Health app
connect to the HealthKit API
Since I am doing this for an app, I want to make it easier on the user. Therefore, connecting to HealthKit’s API makes the most sense.
I’ve looked at the XML file from the iPhone export and its fields are similar, so I think this can still be helpful if you’re working from an iPhone export instead.
Methodology used to aggregate daily sleep duration
Once I have access to my data, I calculate the daily sleep duration with the following steps:
Filter the “sampleType” column for “HKCategoryTypeIdentifierSleepAnalysis”.
Filter the values column for 3, 4, or 5. As far as I can tell, these correspond to the asleepCore, asleepDeep, and asleepREM cases of HealthKit’s HKCategoryValueSleepAnalysis, i.e. the states where I am actually asleep.
Sometimes they show up as 3.0, 4.0, and 5.0.
Create a “sleep_date_column”. Since sleep is an inter-day event, I need a cutoff time that determines which day a sleep record should count toward, based on its startDate. Both Apple and Oura calculate the day’s sleep as the sleep you experienced from the night before into the morning.
I use a cutoff of 3pm, so any sleep record whose startDate is after 3pm belongs to the next day.
Then create a sleep duration column which is just the difference between the “startDate” and the “endDate” columns.
Sum over the sleep duration values grouping by the sleep_date_column.
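The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not necessarily the exact What Sticks code: it assumes each record is a dictionary with “sampleType”, “value”, “startDate” and “endDate” keys (the “value” key name is an assumption), that values are numeric, and that timestamps look like “2024-05-29 01:00:00 +0000”. The function name and the sample records are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

ASLEEP_VALUES = {3, 4, 5}              # the values kept in the filter step
CUTOFF_HOUR = 15                       # 3pm cutoff
DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"   # assumed timestamp layout


def daily_sleep_hours(records):
    """Sum asleep time (in hours) per sleep date from a list of record dicts."""
    totals = defaultdict(float)
    for rec in records:
        if rec["sampleType"] != "HKCategoryTypeIdentifierSleepAnalysis":
            continue
        # Values sometimes arrive as 3.0 instead of 3, so go through float first.
        if int(float(rec["value"])) not in ASLEEP_VALUES:
            continue
        start = datetime.strptime(rec["startDate"], DATE_FORMAT)
        end = datetime.strptime(rec["endDate"], DATE_FORMAT)
        # A record starting at or after the cutoff hour counts toward the next day.
        sleep_date = start.date()
        if start.hour >= CUTOFF_HOUR:
            sleep_date += timedelta(days=1)
        totals[sleep_date] += (end - start).total_seconds() / 3600
    return dict(totals)


# Hypothetical records, only to show the shape of the input.
sample_records = [
    {"sampleType": "HKCategoryTypeIdentifierSleepAnalysis", "value": "3.0",
     "startDate": "2024-05-28 23:30:00 +0000", "endDate": "2024-05-29 01:00:00 +0000"},
    {"sampleType": "HKCategoryTypeIdentifierSleepAnalysis", "value": 4,
     "startDate": "2024-05-29 01:00:00 +0000", "endDate": "2024-05-29 02:00:00 +0000"},
    {"sampleType": "HKCategoryTypeIdentifierSleepAnalysis", "value": "2.0",
     "startDate": "2024-05-29 02:00:00 +0000", "endDate": "2024-05-29 02:10:00 +0000"},
    {"sampleType": "HKQuantityTypeIdentifierStepCount", "value": 120,
     "startDate": "2024-05-29 09:00:00 +0000", "endDate": "2024-05-29 09:05:00 +0000"},
]
result = daily_sleep_hours(sample_records)
print(result)
```

Note that the sketch treats exactly 3:00pm as past the cutoff; whether that edge case should go to the current or the next day is a judgment call.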
Comparison of results of this methodology with my iPhone’s Health app
My sleep data is initially collected by Oura Ring. The histogram here compares the sleep durations of Apple Health, Oura Ring and What Sticks, which is the app I am building.
Most days are very close, but you’ll notice that toward the right of the chart, May 29 and 30 are off by about an hour between Apple and Oura. What Sticks is right in the middle. This was not intentional. There is clearly some difference in the methodology, but I have yet to figure it out.
Timezones
The major takeaway for me is that Apple Health already accounts for timezone changes.
My initial concern was how the Apple Health sleep records handle timezone differences. When you look at the data in the startDate and endDate columns, you’ll notice all have “+0000” at the end. This suggests that the timezone is UTC/GMT. However, in my experience it has already been converted to my timezone.
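If the clock time in these fields is already local, as it appears to be in my data, one way to handle it is to parse the timestamp and drop the “+0000” offset instead of converting it. The snippet below is a minimal sketch of that idea. The timestamp layout and the function name are assumptions, and it would be wrong for data where the offset is genuine.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed timestamp layout


def parse_as_wall_clock(stamp):
    """Parse a '+0000'-suffixed timestamp and keep its clock time unchanged."""
    parsed = datetime.strptime(stamp, DATE_FORMAT)
    # Drop the offset rather than converting, so 23:15 stays 23:15.
    return parsed.replace(tzinfo=None)


local_start = parse_as_wall_clock("2024-01-22 23:15:00 +0000")
print(local_start)
```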
Using the same methodology with no changes to timezones, I looked at a period where I travelled a lot, crossing many timezones. Figure 3 shows the differences between the three daily sleep durations. The columns in gray indicate the days I travelled across timezones.
There are 34 selected days in Figure 3. During seven of these days I spent some time on a plane crossing timezones. I included days surrounding the travel to make sure I didn’t miss any carry-over effects from the timezone conversion by either Oura or Apple.
I attempted to use 2 days prior to each trip and 4 days after it. Unfortunately, my Oura Ring battery died a few times, so I tend to have some days missing directly following my travel. In fact, the only trip for which I have a complete 4 days before and after is the one on 2024-01-22. The other periods have at least one following day where some data is missing.
In general, I don’t think these are too far off, considering I don’t have the exact methodology behind the dataset I am using. The date that sticks out the most in Figure 3 is 2024-03-01, but even there the difference is with Oura Ring rather than Apple. Ideally, there would be no differences. However, given that the dataset comes from Apple, it’s good to know that Apple and What Sticks differ only minimally.
Conclusion
The What Sticks sleep duration calculation is not bad considering the Apple Health to Oura Ring comparison. There are some apparent differences in methodology, but since I do not know what they are, I will continue with what I have for the time being.
The key component of What Sticks is to calculate a Pearson’s Correlation coefficient. To understand how much impact this inaccuracy has on the correlation coefficient, we’ll need to know the distribution of the errors between the What Sticks and the Apple Health sleep durations.
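As a starting point, the error distribution and the correlation coefficient can both be computed with the standard library. The sketch below is illustrative only: the function names are hypothetical and the numbers are made-up hours, not my actual sleep data.

```python
from math import sqrt
from statistics import mean, stdev


def error_summary(what_sticks, apple):
    """Summarise the per-day differences (What Sticks minus Apple Health)."""
    errors = [w - a for w, a in zip(what_sticks, apple)]
    return {
        "mean": mean(errors),
        "stdev": stdev(errors),
        "max_abs": max(abs(e) for e in errors),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up daily sleep durations in hours, for illustration only.
what_sticks_hours = [7.0, 6.5, 8.0, 7.5]
apple_hours = [7.0, 6.0, 8.0, 8.0]

summary = error_summary(what_sticks_hours, apple_hours)
agreement = pearson(what_sticks_hours, apple_hours)
print(summary, agreement)
```

Comparing the correlation of some other variable against each of the two sleep series would then show how much the calculation differences actually move the coefficient.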
Regardless, I would say the What Sticks sleep duration calculation needs help.
Methodology used to collect data from Oura Ring API
I got the Oura Ring sleep duration from the https://api.ouraring.com/v2/usercollection/sleep endpoint.
The response included a dictionary.
In the first level there was the “data” key.
Inside “data”, there was a list of dictionaries.
Each dictionary in this list captured sleep data for a day. However, sometimes there were multiple entries for the same day.
Each dictionary in the list (found in “data”) has a “type” element whose values, in my case, were one of “late_nap”, “long_sleep”, or “sleep”.
I found that if I wanted to match the sleep duration shown in the Oura Ring app on my iPhone, I needed to sum the “total_sleep_duration” values. Specifically, I needed to sum the values across the “late_nap” and “long_sleep” types, omitting the “total_sleep_duration” of entries whose type was “sleep”.
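The summation described above can be sketched as follows, working on an already-parsed response dictionary. This is a minimal sketch with a few assumptions: that each entry carries a “day” key identifying the date it belongs to, and that “total_sleep_duration” is in seconds. The function name and the sample response are made up for illustration.

```python
from collections import defaultdict

COUNTED_TYPES = {"late_nap", "long_sleep"}  # entries of type "sleep" are left out


def oura_daily_sleep_hours(response):
    """Sum total_sleep_duration per day over the counted sleep types."""
    totals = defaultdict(float)
    for entry in response["data"]:
        if entry["type"] not in COUNTED_TYPES:
            continue
        # Assumes the duration is reported in seconds; convert to hours.
        totals[entry["day"]] += entry["total_sleep_duration"] / 3600
    return dict(totals)


# Hypothetical response with several entries for the same day.
sample_response = {
    "data": [
        {"day": "2024-05-29", "type": "long_sleep", "total_sleep_duration": 25200},
        {"day": "2024-05-29", "type": "late_nap", "total_sleep_duration": 1800},
        {"day": "2024-05-29", "type": "sleep", "total_sleep_duration": 900},
    ]
}
oura_totals = oura_daily_sleep_hours(sample_response)
print(oura_totals)
```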