> To enhance our service and offer additional content to users, advertisements will be displayed on the Cover Screen for the Weather, Color, and Daily Board themes.
URLs from my pocket archive (~4200 items) came to around 85k tokens. Assuming 2k output tokens, it would cost me about 18 cents to run this via the API (o3 model) [1].
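For anyone who wants to redo the math with their own numbers, here is a rough sketch of that estimate. The per-token prices ($2 per 1M input tokens, $8 per 1M output tokens) are an assumption on my part, not something taken from the pricing page at the time of writing, so check current prices before relying on it.

```python
# Assumed prices in USD per 1M tokens (hypothetical values; verify against
# the provider's current pricing page before trusting the result).
INPUT_PRICE_PER_M = 2.00
OUTPUT_PRICE_PER_M = 8.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of a single API call."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# ~85k input tokens for the URL list, ~2k output tokens for the profile.
cost = estimate_cost(85_000, 2_000)
print(f"Estimated cost: ${cost:.3f}")
```

Under these assumed prices the input dominates: almost all of the cost comes from the 85k tokens of URLs, not from the generated profile.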
After reading this I realized I also have an archive of my pocket account (4200 items), so I tried the same prompt with o3, gemini 2.5 pro, and opus 4:
- chatgpt UI didn't allow me to submit the input, saying it's too large, even though it was around 80k tokens, less than o3's 200k context size.
- gemini 2.5 pro: worked fine for the personality and interest related parts of the profile, but it got the age range, job role, location, and parental status wrong.
- opus 4: nailed it and did a more impressive job. It accurately predicted my base city (amsterdam), age range, and relationship status, but didn't include anything about whether I'm a parent.
Both gemini and opus failed in predicting my role, probably understandably. Although I'm a data scientist, I read a lot about software engineering practices because I like writing software and since I don't have the opportunity at work to do this kind of work, I code for personal projects, so I need to learn a lot about system design, etc. Both models thought I'm a software engineer.
Overall it was a nice experiment. Something I noticed is that both models mentioned photography as my main hobby, but if they had access to my youtube watch history, they'd confidently say it's tennis. For topics and interests that we usually watch videos about rather than read articles about, it would be interesting to combine the youtube watch history with this pocket archive data (although it would be challenging to get that data).
You should be able to use Google Takeout to get all of your YouTube data, including your watch history.
This article is a nice example of someone using it:
> When I downloaded all my YouTube data, I’ve noticed an interesting file included. That file was named watch-history and it contained a list of all the videos I’ve ever watched.
This can give a false sense of what Google (Alphabet) actually knows about you. That above is Google playing the game of 'ok, here is what we know of your activities on youtube when logged in!'
But Google and the rest of the "advertising" (euphemism for surveillance) industry track and create "profiles" based on a basket of data points, from ip/MAC address to the rest of their bag of tricks.
Internally at Google, a toy tool to peek into your own personal advertisement profile was released and taken down within a week or two because it was creepily knowledgeable about you.
Yes, I've done this in the USA. Pretty neat. I have it on my todo list to parse it and find all the music videos I've watched 3 or more times so I can archive them.
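A minimal sketch of that "watched 3 or more times" pass, assuming the history was exported as JSON and that each entry carries `title` and `titleUrl` fields. Those field names are an assumption about the export format, so check them against your own file. It does not try to tell music videos apart from everything else.

```python
import json
from collections import Counter


def videos_watched_at_least(entries, min_views=3):
    """Return {url: (title, count)} for videos seen min_views or more times."""
    counts = Counter()
    titles = {}
    for entry in entries:
        url = entry.get("titleUrl")  # assumed field name; may be absent
        if not url:
            continue  # skip entries with no video link
        counts[url] += 1
        titles.setdefault(url, entry.get("title", ""))
    return {url: (titles[url], n) for url, n in counts.items() if n >= min_views}


# Small inline sample standing in for the contents of the exported file.
sample_json = """
[
  {"title": "Watched Song A", "titleUrl": "https://www.youtube.com/watch?v=aaa"},
  {"title": "Watched Song B", "titleUrl": "https://www.youtube.com/watch?v=bbb"},
  {"title": "Watched Song A", "titleUrl": "https://www.youtube.com/watch?v=aaa"},
  {"title": "Watched a video that has been removed"},
  {"title": "Watched Song A", "titleUrl": "https://www.youtube.com/watch?v=aaa"}
]
"""
repeats = videos_watched_at_least(json.loads(sample_json))
for url, (title, n) in repeats.items():
    print(n, title, url)
```

Counting by URL rather than by title avoids merging different videos that happen to share a name.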
It is available and it can be surprisingly large. I've somehow accumulated multiple GB of data from YT alone. Which feels a bit absurd - there's bound to be lots of waste there.
I believed this, which is what made me avoid computer science in college; I wanted to avoid ruining my favorite hobby.
After a few years post graduation, where I wasn't sure what I wanted to do and I floundered to find a career, I decided to give software development a try, and risk ruining my favorite hobby.
Definitely the best decision I could have made. Now people pay me a lot of money to do the thing I love to do the most... what's not to love? 20 years later, it is still my favorite hobby, and they keep paying me to do it.
I think it heavily depends on who you're working for.
If they get out of the way and let you do the thing you love how you want to do it you'll get good results for you and them.
If they treat you like a cog in a machine and assume they need to carrot and stick you into doing things because you might not really want to be there, you'll be miserable.
Sure, of course. Sometimes it works out to follow your passion into a career. I was objecting to the apparent premise that that’s _always_ what you should do.
My first software job I enjoyed. My 2nd/current job I enjoy everything except the actual work. Too much bureaucracy, but it hasn't ruined my love for the craft yet. Oh well, I'm building some other skills I didn't know I had in me.
You need to use an iterative refinement pyramid of prompts. Use a cheap model to condense the majority of the raw data in chunks, then increasingly stronger and more expensive models over increasingly larger sets of those chunks until you are able to reach the level of summarization you desire.
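Roughly, the pyramid could look like the sketch below. The `summarize(text, model)` callable is hypothetical and stands in for whatever API client you use. The model names and the group size are placeholders too, not recommendations.

```python
from typing import Callable, List


def pyramid_summarize(
    chunks: List[str],
    models: List[str],
    summarize: Callable[[str, str], str],
    group_size: int = 4,
) -> str:
    """Condense chunks level by level; models[0] is cheapest, models[-1] strongest."""
    if not chunks or not models:
        raise ValueError("need at least one chunk and one model")
    if group_size < 2:
        raise ValueError("group_size must be at least 2 so each level shrinks")

    # Level 0: the cheapest model condenses every raw chunk on its own.
    level = [summarize(chunk, models[0]) for chunk in chunks]

    tier = 1
    while len(level) > 1:
        # Move to a stronger model at each level, staying on the last one
        # once the list of models is exhausted.
        model = models[min(tier, len(models) - 1)]
        groups = [level[i:i + group_size] for i in range(0, len(level), group_size)]
        level = [summarize("\n".join(group), model) for group in groups]
        tier += 1
    return level[0]


# Fake summarizer that only records which model was called (no network use).
calls: List[str] = []


def fake_summarize(text: str, model: str) -> str:
    calls.append(model)
    return text[:20]


raw_chunks = [f"chunk {i} " * 10 for i in range(8)]
final = pyramid_summarize(raw_chunks, ["cheap", "mid", "strong"], fake_summarize)
print(calls.count("cheap"), calls.count("mid"), calls.count("strong"))
```

With 8 chunks and groups of 4, the cheap model does most of the calls and the strongest model is called only once, on the already condensed text.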
re o3: you can zip the file, upload it, and it will use python and grep and the shell to inspect it. I have yet to try using it with a sqlite db, but that's how i do things locally with agents.
The author mentions that doing that didn't give them a high quality response. Adding the texts to the model's context makes all the information available for it to use.
I think a reasoning/thinking-heavy model would do better at piecing together the various data points than an agentic model. Would be interested to see how o3 does with the context summarized.
This really hurts. My whole content consumption workflow depends on Pocket. I shortlist from my rss reader (inoreader) directly into pocket, then it gets synced with readwise reader automatically, where I listen to the chosen articles [1]. I also use my pocket account's archive as a collection of the articles and writings I have liked (planning to build a simple personal content recommendation system for myself).
[1] https://saeedesmaili.com/posts/my-content-consumption-workfl...
Not sure yet. Since I'm already paying for readwise, I might use that, but then sharing from inoreader to readwise will be multiple steps instead of the current built in pocket integration I use. I should also make sure I'm keeping a local archive of read and unread stuff just in case.
I'll also contact inoreader to see if they will replace their built-in pocket integration with anything else.
I've been using Inoreader for a few years now and I'm pretty happy with it. Its reliability and feature set are the right balance for me. I've written about its pros and cons [1]; the main pros for me are:
- Very smooth experience between web, android, and iOS apps (I’m mentioning this first, as many other apps I’ve tried are flaky)
- Mark as read while scrolling (Very useful for quickly shortlisting items from the feed. This is probably the main reason I’ve been able to replace social media apps with Inoreader.)
- Rules to auto-delete duplicate items or items whose title contains specific words.