So, by now you must have heard about OpenAI’s ChatGPT 4o(mni). If not, you should definitely find a YouTube video of their demonstration. The OpenAI demo was rather rushed, though. Almost as if they had just found out Google was going to announce some new AI features and wanted to steal their thunder the night before…

Nevertheless, it is an impressive demo. Heck, it got me to renew and fork over twenty bucks for a chance at the new features earlier than general users. One of the podcasts I listen to declared the Rabbit R1 dead after seeing this demo. But I don’t think there’s a strong relation between the capabilities OpenAI demo’ed and what the R1 represents. If I understand correctly, 4omni is a natively multi-modal LLM that has been trained on more than just text: images, music, video, documents and such. The Rabbit R1 is an agent which can take fairly independent action on your behalf. You give it a command, it does some strategizing and planning of steps to follow, and then it begins to act on your behalf. I tried out another agent in the form of a browser plugin which was able to look into my email, calendar, maps and online accounts to perform tasks that I asked it to do. This was eye-opening for me. But it did not seem to correlate to what 4omni was demonstrating. 4o didn’t seem to take action on my behalf, such as making dinner reservations based on certain criteria.

As for an AI agent, the deal with Apple and OpenAI is really interesting to me. Everyone complains about Siri. What if Siri was replaced with ChatGPT (not Sky) and also had some guardrailed ability to perform actions on your behalf using the access it has to apps on your iPhone? This could be interesting.

The other player here is Google with Android and Google Assistant. Google Assistant was introduced almost a decade ago and when it first came out, I was a big fan. I could ask it questions from my watch or my earphones. I could receive and reply to text messages with my headphones without taking out my phone. It was connected to my home and I could turn on my air conditioner when I was a certain distance away from home.

But these days, Gemini has not been making a very good show of itself. The most recent gaffe is the Search Generative Experience telling people how many rocks to eat each day, or offering pizza recipes with glue to keep the cheese from sliding off. The trust has been eroded. If we can’t trust Gemini to return safe responses via RAG (retrieval-augmented generation), there’s no way people would trust giving it agent capabilities and access to their phone apps and data. Apple, on the other hand, has a lot more trust from its users. (Let’s not talk about the commercial where they squished fine art and musical instruments…) So, I see Apple in a better position to release this type of agent-assistant.

This reminds me of what my teachers have taught us since elementary school – it’s about quality, not quantity. In the case of AI model training, it has to be both. Peter Norvig and others have emphasized the importance of large training data sets. Now it looks like it’s not just about the amount of training data, but about feeding it to the LLM intelligently and having it recognize sarcasm and trolling. Haven’t we learned anything from Microsoft’s Tay?

I think I need to take a break from podcasts. I find that every interim moment I have, I pop in my earbuds and listen to really interesting podcasts. It seems to take away that boring space where I’m forced to just stare at the subway ceiling. But I’m starting to feel like that interim space of having nothing to consume is kinda like sleep. Some say that sleep is when your mind organizes the thoughts and experiences you’ve had during the day and helps make better sense of them, orienting and connecting them. I kinda feel like interim space might be like that as well. Some people listen to podcasts at 2x speed to consume and learn as much as possible. I think for a while I need my podcasts to go at 0x speed.

Oh and speaking of sleep, here’s a pretty good podcast episode on it – open.spotify.com/episode/3…