In his bestselling book The Black Swan, Nassim Taleb says he “completely gave up reading newspapers and watching television.” This freed up an hour or more a day, which he said was “enough time to read more than a hundred additional books per year, which, after a couple of decades, starts mounting.”
This really got me thinking. Taleb made a small change that would allow him to read hundreds of extra books over a decade. Over the last couple of years, I too have made small changes in my life in order to maximize the number of books I read. As the number of books I’ve read grows, so does my need for an organized list of everything I’ve read. I already have a basic spreadsheet of my books, but I want more information. The inner accountant in me keeps saying: “add more columns!”
Since January 2018 I have read 226 books, or slightly over 2 books a week.
I wanted to comb through the book data I have generated to see what I might learn about my reading habits.
Before I started, I knew I wanted to find a few things out:
- I am fascinated by book ratings. I’ve tried to create my own ranking system but haven’t been able to come up with one that I like. I wanted to look at the Goodreads data to see how my own ratings compared to the average Goodreads rating for a book.
- I wanted a better understanding of the genres I read. When people ask me what genres of books I like to read, I usually tell them “biographies.” I’m curious if the data says that’s true.
I keep a detailed book log of all the books I read. Recently, I started putting more effort into keeping my Goodreads account accurate and up to date. Goodreads has an export option that allows you to export the books you have read.
I merged my own book log with the Goodreads data and began summarizing the data. I also found this online tool which turns your exported Goodreads data spreadsheet into colorful graphs and charts. I also used a Python script to gather information from Goodreads that wasn’t included in the original export.
Up to this point, I haven’t thought much about the ratings I give on Goodreads. My average rating is 4.06, which seems a little high to me. Fifty-six of the books (25 percent) were rated a perfect 5 or, as Goodreads describes them, “it was amazing.”
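A minimal sketch of how a merge like this could work, using only Python's standard library. This is not the exact script I used: the column names (`Title`, `Date Finished`, `My Rating`, `Average Rating`) and the finish dates are assumptions for illustration, while the ratings are the ones quoted in this post. Books are matched on a normalized title, which is a simplification.

```python
import csv
import io

# Hypothetical stand-in for a personal book log; the finish dates are made up.
BOOK_LOG_CSV = """Title,Date Finished
Business Adventures,2019-05-01
the tipping point ,2018-02-10
"""

# Assumed shape of a Goodreads export; ratings are the ones quoted in this post.
GOODREADS_CSV = """Title,Author,My Rating,Average Rating
Business Adventures,John Brooks,5,3.81
The Tipping Point,Malcolm Gladwell,2,3.96
Eat and Run,Scott Jurek,5,3.99
"""


def normalize(title):
    """Lowercase and trim a title so small formatting differences still match."""
    return title.strip().lower()


def merge_logs(log_text, goodreads_text):
    """Attach Goodreads columns to every row of the personal log (a left join)."""
    goodreads = {
        normalize(row["Title"]): row
        for row in csv.DictReader(io.StringIO(goodreads_text))
    }
    merged = []
    for row in csv.DictReader(io.StringIO(log_text)):
        match = goodreads.get(normalize(row["Title"]), {})
        # Goodreads values win on shared columns; unmatched rows stay as logged.
        merged.append({**row, **match})
    return merged


merged = merge_logs(BOOK_LOG_CSV, GOODREADS_CSV)
for book in merged:
    print(book["Title"], book["Date Finished"], book.get("Average Rating"))
```

Matching on title alone will miss books whose titles are logged differently in the two files, so a real merge would probably need some manual cleanup.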
I am okay with having a high average rating. I expend considerable effort selecting the books I read. When I read a book I expect it to be good because I have prescreened it. But how could I come up with which books were the real crème de la crème?
The top 10%
One ranking system I tried last year was to use a fixed percentage to determine how many “5-star” ratings I could give. Basically, I would limit “5-star” ratings to 10 percent of the total number of books read. I decided to revisit this concept and determine what the top 20 books of the last 25 months were, knowing full well that one book’s inclusion was another book’s exclusion. To arrive at 20 books, I excluded duplicate readings, which left 204 unique books read since 2018; 10 percent of 204 rounds down to 20.
Before I reveal the list, I want to clarify that I didn’t rate books as 5-stars because of any single identifiable condition (e.g., literary superiority, excellent prose, I-can’t-put-this-book-down etc.). Rather, it’s some combination of all of them. To be succinct: these books inspired me.
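The arithmetic behind the cap can be sketched in a few lines of Python. The function name and the placeholder titles are hypothetical; only the totals (226 books read, 204 of them unique, a 10 percent cap) come from this post.

```python
import math


def five_star_quota(titles, share=0.10):
    """Return (unique book count, number of 5-star slots) for a reading log.

    Re-reads are collapsed by comparing lowercased, trimmed titles, and the
    quota is rounded down so it never exceeds the chosen share.
    """
    unique = {title.strip().lower() for title in titles}
    return len(unique), math.floor(len(unique) * share)


# Placeholder log: 204 distinct titles plus 22 re-reads gives 226 books read.
reading_log = [f"book {i}" for i in range(204)] + [f"book {i}" for i in range(22)]
unique_count, quota = five_star_quota(reading_log)
print(len(reading_log), unique_count, quota)
```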
The List (in alphabetical order):
- A Supposedly Fun Thing I’ll Never Do Again – David Foster Wallace
- Born to Run – Christopher McDougall
- Driven: An Autobiography – Larry H. Miller
- East of Eden – John Steinbeck
- How Will You Measure Your Life? – Clayton M. Christensen
- Master of the Senate – Robert A. Caro
- Means of Ascent – Robert A. Caro
- Outliers: The Story of Success – Malcolm Gladwell
- Principles: Life and Work – Ray Dalio
- Sam Walton: Made in America – Sam Walton
- Shoe Dog – Phil Knight
- String Theory: David Foster Wallace on Tennis – David Foster Wallace
- The Brothers Karamazov – Fyodor Dostoyevsky
- The Moon is a Harsh Mistress – Robert A. Heinlein
- The Path to Power – Robert A. Caro
- The Power Broker – Robert A. Caro
- The Remains of the Day – Kazuo Ishiguro
- The Sirens of Titan – Kurt Vonnegut
- The Third Policeman – Flann O’Brien
- Tune In (The Beatles: All These Years) – Mark Lewisohn
- Harry Potter and the Goblet of Fire – J.K. Rowling
- Spaceman – Mike Massimino
- “Surely You’re Joking, Mr. Feynman!” – Richard Feynman
Me vs. Goodreads
I thought it would be interesting to look at the books where my rating differed the most from the average Goodreads rating.
The 3 books I rated higher than the average Goodreads rating:
| Title | Author | My Rating | Goodreads Rating |
|---|---|---|---|
| Business Adventures | John Brooks | 5 | 3.81 |
| The Adventures of Tom Sawyer | Mark Twain | 5 | 3.91 |
| Eat and Run | Scott Jurek | 5 | 3.99 |
The 3 books I rated lower than the average Goodreads rating:
| Title | Author | My Rating | Goodreads Rating |
|---|---|---|---|
| The Long Walk | Richard Bachman (Stephen King) | 2 | 4.11 |
| For Whom the Bell Tolls | Ernest Hemingway | 2 | 3.97 |
| The Tipping Point | Malcolm Gladwell | 2 | 3.96 |
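The gap behind each of these rankings is just my rating minus the Goodreads average. Here is a small Python sketch using the six books from the two tables above; the function name `rating_gaps` is made up for this example.

```python
# (title, my rating, average Goodreads rating), taken from the tables above.
books = [
    ("Business Adventures", 5, 3.81),
    ("The Adventures of Tom Sawyer", 5, 3.91),
    ("Eat and Run", 5, 3.99),
    ("The Long Walk", 2, 4.11),
    ("For Whom the Bell Tolls", 2, 3.97),
    ("The Tipping Point", 2, 3.96),
]


def rating_gaps(rows):
    """Return (title, my rating - Goodreads average), largest gap first."""
    gaps = [(title, round(mine - average, 2)) for title, mine, average in rows]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


gaps = rating_gaps(books)
for title, gap in gaps:
    print(f"{gap:+.2f}  {title}")
```

The first entries are the books I liked far more than the crowd did, and the last entries are the ones I liked far less.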
Through this process, I learned that classifying a book’s genre is not an easy task.
I used the tool mentioned above to download my Goodreads genre data. Goodreads uses crowd-sourced information from its users to populate the genres for a book. Users can place books into their own digital bookshelves on Goodreads, and they can call their shelves whatever they’d like. For example, a user might have a “mystery” shelf, which she populates with books she thinks fit her own self-imposed criteria. Goodreads then aggregates people’s shelves and shows the results under the “Genre” section of a book’s page.
As part of my analysis, I was able to bring in the top-5 genres for each book.
I wanted to be able to sort my data by literary type (i.e., fiction or nonfiction) and by genre. This meant I had to label each book with a single genre, an almost impossible task. Right away I struggled with what a “business” book is. I wanted to classify a lot of the books I’ve read as “business” books, but other genres seemed more appropriate. For example, take Malcolm Gladwell’s Outliers. Its top genre (other than the generic “Nonfiction” label) is Psychology. But almost as many people shelved it as a “Business” book. I didn’t read Outliers because I thought it was a Psychology book; I read it because it seemed like a business book.
Outliers’ Goodreads data:
My top 3 genres (when a book was just classified as one genre) were:
- Biography – 36 books
- Business (so much for not labeling a book as “Business”) – 34 books
- Science Fiction – 31 books
Altogether, my top-3 genres comprise 46% of all the books I’ve read since 2018.
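A tally like this is easy to sketch in Python once every book carries exactly one genre label. The labels below are a tiny hypothetical sample, not my real data, and `top_genres` is a name invented for this example.

```python
from collections import Counter


def top_genres(labels, n=3):
    """Return the n most common genres as (genre, count, percent of all books)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [
        (genre, count, round(100 * count / total))
        for genre, count in counts.most_common(n)
    ]


# Hypothetical one-genre-per-book labels for a four-book log.
labels = ["Biography", "Biography", "Business", "Science Fiction"]
summary = top_genres(labels, n=2)
print(summary)
```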
One More Approach
I wasn’t that happy with my one-genre-per-book approach. Luckily, I was able to gain more insight from the Bookstats tool mentioned above. I uploaded my Goodreads data onto the website and it spit out the following graph.
The approach this time is to include each of the top-5 genres of a book. These results display a better array of my reading. When considering books as multiple genres, my top 3 now become:
One of the more curious insights I found is that I finish books on Wednesday more than any other day. This is 36% more than Saturday, my next most likely day to finish a book.
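The online tool did this counting for me, but the idea can be sketched in Python as below. The three dates are placeholders rather than my real finish dates, and both function names are hypothetical.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def finish_day_counts(finish_dates):
    """Count how many books were finished on each day of the week."""
    return Counter(DAY_NAMES[d.weekday()] for d in finish_dates)


def percent_more(a, b):
    """How much larger a is than b, as a percentage of b."""
    return 100 * (a - b) / b


# Placeholder finish dates: two Wednesdays and one Saturday.
finished = [date(2020, 1, 1), date(2020, 1, 8), date(2020, 1, 4)]
counts = finish_day_counts(finished)
print(counts.most_common())
print(percent_more(counts["Wednesday"], counts["Saturday"]))
```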
Per the graph below, I tend to rate Biography and History books higher than any other genre, and Psychology and Personal Development books the lowest.
Longest Books I’ve Read
I found this chart very interesting. It plots the books I’ve read by number of pages. As you can see, most of the books I’ve read are around 300 pages. I’ve read two books which were over 1,000 pages: The Power Broker (you can read what I wrote about The Power Broker here) and Master of the Senate. Both of those books were written by Robert A. Caro. The longest non-Caro book I have read is The Snowball: Warren Buffett and the Business of Life.
Pages and Words
Per the online tool, I have read 81,513 pages, which equates to around 22 million words.
I hope you’ve found all this book data interesting. Hopefully, it makes you want to read more, and to organize what you’ve read!
Going forward, I’m going to implement a simple ranking system based on a 10-point scale. I found this Reddit post about a man who read and ranked over 10,000 books in his life. There were years of his life in which he was reading 2-3 books a day. He logged each book’s title, the date he read it, and a ranking on a 10-point scale. If you are interested, here is a Google Sheet with his book list.
Write to me: firstname.lastname@example.org