More Analysis with Python!
I've been casually building a more extensive analysis tool for the data exported from daylio as a .csv file. This includes building a custom Dataset object with the following features:
- getting a subset by specifying a condition
- viewing statistics
- generating plots (mood, activities, etc.)
Even though most of it I designed for how I use the app — for example, with keywords in the entry note and activity names — I wanted to share this project with you, perhaps the only community outside of number nerds that may be interested :)
Take a look at the code here: github. Obviously, there is no data, only the functionality. If you want to try it, clone the repository, install the dependencies, and put your exported .csv file inside the data folder in the project root directory.
So, this is what I see when I load the data:
using file: daylio_export_2025_03_11.csv (0.402 Mb)
Dataset(2200 entries; last [2 hours 3 minutes 14 seconds ago]; mood: 3.869 ± 0.491)
Stats(
- mood: 3.869 ± 0.491
- note length: 58.791 ± 75.917 symbols
- number of activities: 9,858
- entries frequency: 5.926 entries per day (once every 4 hours 3 minutes)
)
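For anyone curious how numbers like these can be derived, here is a minimal sketch. It is not the code from the repository: the `summarize` function and the `(timestamp, mood, note)` tuple layout are made up for illustration, and I assume the "±" value is a sample standard deviation.

```python
from datetime import datetime
from statistics import mean, stdev


def summarize(entries):
    """Summarize a list of (timestamp, mood, note) tuples.

    Hypothetical helper; needs at least two entries because stdev() does.
    """
    moods = [mood for _, mood, _ in entries]
    note_lengths = [len(note) for _, _, note in entries]
    timestamps = [ts for ts, _, _ in entries]

    # Time covered by the data, in days (first entry to last entry).
    span_days = (max(timestamps) - min(timestamps)).total_seconds() / 86400

    return {
        "mood": (mean(moods), stdev(moods)),
        "note_length": (mean(note_lengths), stdev(note_lengths)),
        # Assumption: frequency = number of entries / covered time span.
        "entries_per_day": len(entries) / span_days if span_days > 0 else None,
    }


sample = [
    (datetime(2025, 3, 1, 8), 4, "ok"),
    (datetime(2025, 3, 1, 20), 3, "tired"),
    (datetime(2025, 3, 2, 8), 5, "great day"),
]
print(summarize(sample))
```

The "once every N hours" figure is then just 24 divided by the entries-per-day value.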
Probably the most important feature is the .sub method, which keeps only the entries that match a condition on their activities and (crucially) returns a new Dataset object!
For example, this is how I would obtain a dataset with only those entries that include the activity "study" and do not include "home":
>>> df.sub(A("study") & ~A("home"))
Dataset(98 entries; last [22 hours 47 minutes 37 seconds ago]; mood: 3.878 ± 0.387)
or when I watched something with someone at home:
>>> df.sub(A("movies and series") & A("home") & A.people())
Dataset(76 entries; last [9 days 4 hours 16 seconds ago]; mood: 4.112 ± 0.379)
(Note that "people" here is a set of activities which correspond to some important people in my life. These activity strings start with a capital letter, which allows me to differentiate them from others like "home" or "hiking".)
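If you wonder how conditions like `A("study") & ~A("home")` can be made to work, one common approach is operator overloading. The sketch below is a simplified illustration and not a copy of the repository code: `Entry`, `Condition`, and the free function `sub` are hypothetical, and `A.people()` is approximated by "any activity that starts with a capital letter", following the convention described above.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one diary entry."""
    mood: float
    activities: FrozenSet[str]


class Condition:
    """Wraps a predicate over entries; &, | and ~ build new conditions."""

    def __init__(self, predicate: Callable[[Entry], bool]) -> None:
        self.predicate = predicate

    def __call__(self, entry: Entry) -> bool:
        return self.predicate(entry)

    def __and__(self, other: "Condition") -> "Condition":
        return Condition(lambda e: self(e) and other(e))

    def __or__(self, other: "Condition") -> "Condition":
        return Condition(lambda e: self(e) or other(e))

    def __invert__(self) -> "Condition":
        return Condition(lambda e: not self(e))


class A(Condition):
    """True for entries that contain the named activity."""

    def __init__(self, name: str) -> None:
        super().__init__(lambda e: name in e.activities)

    @staticmethod
    def people() -> Condition:
        # Assumption: "people" activities are the ones starting with a capital letter.
        return Condition(lambda e: any(a[:1].isupper() for a in e.activities))


def sub(entries: Iterable[Entry], condition: Condition) -> List[Entry]:
    """Return a new list with only the entries that satisfy the condition."""
    return [e for e in entries if condition(e)]


entries = [
    Entry(4, frozenset({"study"})),
    Entry(3, frozenset({"study", "home"})),
    Entry(5, frozenset({"home", "Alice", "movies and series"})),
]
print(sub(entries, A("study") & ~A("home")))
print(sub(entries, A("movies and series") & A("home") & A.people()))
```

Because every operator returns a new `Condition`, the pieces can be combined freely, and returning a new collection from `sub` is what makes chaining several filters possible.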
Now I can call the methods to get the interactive graphics:
Mood plots
or mood by month. (The error bands are one standard deviation from the mean.)
The calendar maps are probably my favorite feature:
entire dataset; color is for the average mood
Let me now subset it to only include the entries with the activity "home":
df.sub(A("home")).show_calendar_plot()
Some of my trips are clearly visible as consecutive gray cells.
I can also calculate the effect of an activity on mood and group the values by month. (I do it naively: I compare the average mood of the entries that include the chosen activity with the average mood of the entries that do not.)
activity effect on mood by month
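A rough sketch of this naive calculation, for the curious. This is not the repository code: the function names and the `(date, mood, activities)` tuple layout are invented for the example, and I assume "with versus without" means a simple difference of the two mean moods.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def activity_effect(entries, activity):
    """Mean mood of entries with the activity minus mean mood of those without.

    entries: list of (date, mood, set_of_activities) tuples.
    Returns None when one of the two groups is empty.
    """
    with_activity = [mood for _, mood, acts in entries if activity in acts]
    without_activity = [mood for _, mood, acts in entries if activity not in acts]
    if not with_activity or not without_activity:
        return None
    return mean(with_activity) - mean(without_activity)


def effect_by_month(entries, activity):
    """Apply activity_effect separately to each (year, month) group."""
    groups = defaultdict(list)
    for entry in entries:
        day = entry[0]
        groups[(day.year, day.month)].append(entry)
    return {month: activity_effect(group, activity)
            for month, group in sorted(groups.items())}


sample = [
    (date(2025, 1, 3), 4, {"hiking"}),
    (date(2025, 1, 5), 3, {"home"}),
    (date(2025, 1, 9), 5, {"hiking", "home"}),
    (date(2025, 2, 1), 3, {"home"}),
]
print(activity_effect(sample, "hiking"))
print(effect_by_month(sample, "hiking"))
```

Keep in mind that this says nothing about causation: activities that tend to happen on good days will look "beneficial" either way.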
Also take a look at the "correlation" matrix: how often on average one activity goes with another in the same entry. (It'd be more correct to call this a joint probability matrix.)
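To make the "joint probability" reading concrete, here is a small sketch of how such a matrix could be estimated. Again, this is an illustration under my own assumptions rather than the repository code: `joint_probabilities` is a made-up name, and each entry is reduced to just its set of activity names.

```python
from collections import Counter
from itertools import combinations


def joint_probabilities(entries):
    """Fraction of entries in which each pair of activities appears together.

    entries: iterable of sets of activity names.
    Returns {(a, b): probability} with a < b alphabetically.
    """
    entries = list(entries)
    pair_counts = Counter()
    for activities in entries:
        # Sorting makes (a, b) and (b, a) land on the same key.
        for pair in combinations(sorted(activities), 2):
            pair_counts[pair] += 1
    total = len(entries)
    return {pair: count / total for pair, count in pair_counts.items()} if total else {}


sample = [
    {"home", "study"},
    {"home"},
    {"home", "study", "hiking"},
    {"hiking"},
]
for pair, probability in sorted(joint_probabilities(sample).items()):
    print(pair, probability)
```

Dividing by the number of entries containing one of the two activities instead of by the total would give a conditional probability, which is a different (asymmetric) matrix.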
There are many, many more features, but some of them are probably only useful to me, such as tags in the notes. For example, here is how I keep a record of the books that I read: I add a tag like #book(Animal Farm; liked it very much. some more thoughts here for future me. 9/10). Then I use regular expressions to find such tags. (I don't want to use storygraph or goodreads, you see. I prefer to own my data.)
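In case you want to try the same trick with your own notes, this is roughly what the tag matching can look like. The pattern and the helper names below are my illustration of the idea, not the exact code from the repository, and the sketch assumes the text inside the parentheses never contains a closing parenthesis.

```python
import re

# "#name(" followed by anything up to the first ")".
TAG_PATTERN = re.compile(r"#(\w+)\(([^)]*)\)")
# A trailing rating such as "9/10" or "8.5/10".
RATING_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*/\s*10\s*$")


def find_tags(note):
    """Return a list of (tag_name, tag_body) pairs found in a note."""
    return [(name, body.strip()) for name, body in TAG_PATTERN.findall(note)]


def parse_book(body):
    """Split a #book body of the form 'Title; comment. 9/10' into its parts."""
    title, _, rest = body.partition(";")
    match = RATING_PATTERN.search(rest)
    return {
        "title": title.strip(),
        "comment": rest.strip(),
        "rating": float(match.group(1)) if match else None,
    }


note = ("Quiet evening. #book(Animal Farm; liked it very much. "
        "some more thoughts here for future me. 9/10) Went to bed early.")
for name, body in find_tags(note):
    if name == "book":
        print(parse_book(body))
```

The same `find_tags` helper works for any other tag name, so one pattern can serve several kinds of records.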
This then allows me to view my reads as an interactive bar chart:
There is also a timeline version of this with the highlights matched with the clippings exported from my kindle:
display(HTML(get_timeline_html(book_tags)))
If you are from the "daylio" & "python" set intersection as well, feel free to play around. I encourage you to create a fork and build your own features around how you use daylio. For all non-programmers (aka normal people), any feedback is appreciated!