
GPT-4o with scheduled tasks (jawbone) is available in beta

96 points | 8 hours ago | chatgpt.com
imsotiredspacex7 hours ago

This is the prompt describing the function call parameters:

When calling the automation, you need to provide three main parameters:

1. Title (title): A brief descriptive name for the automation. This helps identify it at a glance. For example, "Check for recent news headlines".

2. Prompt (prompt): The detailed instruction or request you want the automation to follow. For example: "Search for the top 10 headlines from multiple sources, ensuring they are published within the last 48 hours, and provide a summary of any recent Russian military strikes in the Lviv Oblast."

3. Schedule (schedule): This uses the iCalendar (iCal) VEVENT format to specify when the automation should run. For example, if you want it to run every day at 8:30 AM, you might provide:

  BEGIN:VEVENT
  RRULE:FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0
  END:VEVENT

Optionally, you can also include:

• DTSTART (start time): If you have a specific starting point, you can include it. For example:

  BEGIN:VEVENT
  DTSTART:20250115T083000
  RRULE:FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0
  END:VEVENT

In summary, the call typically includes:

• title (string): A short name.
• prompt (string): What you want the automation to do.
• schedule (string): The iCal VEVENT defining when it should run.
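
To make that concrete, here is a hypothetical sketch of those three fields as a plain Python dict. The field names come from the description above; the surrounding structure is my guess, not OpenAI's documented wire format.

  # Hypothetical illustration of the three fields described above.
  # The dict shape is an assumption; only the field names come from the prompt.
  task = {
      "title": "Check for recent news headlines",
      "prompt": "Search for the top 10 headlines from multiple sources ...",
      "schedule": (
          "BEGIN:VEVENT\n"
          "RRULE:FREQ=DAILY;BYHOUR=8;BYMINUTE=30;BYSECOND=0\n"
          "END:VEVENT"
      ),
  }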

ttul7 hours ago

Amazon had an insane number of people working on just the alarms feature in Alexa when they interviewed me for a position years ago. They had entire teams devoted to the tiniest edge case within the realm of scheduling things with Alexa. This is no doubt one of the biggest use cases in computing: getting your computer to tell you what to do at a given time.

qgin7 hours ago

Implementing recurring schedules across time zones is unbelievably maddening. At first glance it seems simple, but it gets very weird very quickly.

wkat42424 hours ago

Yeah, summer time switches in different countries on different days, and often in a different direction (other hemisphere). I used to work on such matters, and those weeks were the toughest.
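
To make the failure mode concrete, here is a small stdlib-only Python illustration (my example, not from the thread): "every day at 08:30 Berlin time" maps to different UTC instants across the late-March 2025 switch to summer time, and the switch date itself differs by country and hemisphere.

  # Why "every day at 08:30" is ambiguous: the same local wall-clock time maps
  # to different UTC instants once daylight saving time flips (Python 3.9+).
  from datetime import datetime
  from zoneinfo import ZoneInfo

  for day in (29, 30, 31):  # the EU switches to summer time on 2025-03-30
      local = datetime(2025, 3, day, 8, 30, tzinfo=ZoneInfo("Europe/Berlin"))
      print(local, "->", local.astimezone(ZoneInfo("UTC")))
  # 08:30 on the 29th is UTC+1; on the 30th and 31st it is UTC+2, so a rule
  # stored as a fixed UTC time would fire an hour off after the change.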

ethbr11 hour ago

Developers when they first start working with time across timezones: "This is a technical problem."

Developers after more research: "Oh... this is a political problem."

echeese7 hours ago

Considering my iPhone alarm still sometimes fails to go off (it just shows the alarm screen silently), I'd be inclined to believe you.

ineedasername4 hours ago

Thanks for that— I thought I was going crazy (well, still could be, I guess) or had some strange habit or gesture I didn't realize was silencing the alarm somehow.

yakz4 hours ago

Whenever I have to wake for something that I absolutely can’t miss, I set 2-3 extra reminders 5 minutes apart precisely because of this “silent alarm” bug. It’s only happened to me a couple of times but twice was enough to completely destroy my trust in the alarm. The first time I thought I just did something in my sleep to cause it, but the UI shows it as if the alarm worked. I’m lucky to have the privilege that if I oversleep an hour or so it’s no big deal, otherwise ye olde tabletop alarm clock would be back.

emptiestplace4 hours ago

I love the questioning my sanity before I've completely opened my eyes part. It's like a jump start to my day.

android5214 hours ago

And Gmail's scheduled delivery just won't work if you want to email yourself a month later.

dmadisetti8 hours ago

The beta shows up inconsistently (it took a few refreshes to get anything to appear), and my limited usage surfaced a plethora of issues:

- Assumed UTC instead of EST. Corrected it and it still continued to bork

- Added random time deltas to my asked times (+2, -10 min).

- A couple of notifications didn't go off at all

- The one that did go off didn't provide a push notification.

---

On top of that, it's only usable without search mode. In search mode, it was totally confused and gave me a Forbes article.

Seems half baked to me.

Doing scheduled research behind the scenes or sending a push notification to my phone would be cool, but I'm surprised they thought this was OK for a public beta.

ineedasername5 hours ago

When I have it do a search, I have to tell it to just gather all the info it can but wait for the next request. Then I explicitly tell it we're done searching and to treat the next prompt as a new request using the new info it found.

That’s the only way I get it to have a halfway decent brain after a web search. Something about that mode makes it more like a PR drone version of whatever I asked it to search, repeating things verbatim even when I ask for more specifics in follow-up.

gukov7 hours ago

You'd think OpenAI's dev velocity and quality would be off the charts since they live and breathe "AI." If the company building ChatGPT itself often delivers buggy features, it doesn't bode well for this whole 'AI will eat the world' notion.

golergka5 hours ago

So far, I've found AI to be a great force multiplier on small greenfield projects. In a huge corporate codebase, it has the power of an advanced refactoring tool (one that doesn't touch more than a handful of files at a time) and a CSS wizard.

practice94 hours ago

Well, none of the labs have good frontend or mobile engineers, or even infra engineers.

Anthropic is ahead here because they keep their UIs simplistic, so the failure modes are also simple (bad connection).

OpenAI is just pushing half-baked stuff to prod and moving on (GPTs, Canvas).

I find it hilarious and sad that o1-pro just times out thinking on very long or image-heavy chats. You need to reload the page multiple times after it fails to reply, and maybe the answer will appear (or not? Or in 5 minutes?). It kinda shows they're not testing enough and not eating their own dog food, and it feels like the ChatGPT 3.5 UI before the redesign.

lolinder4 hours ago

> Anthropic is ahead in this because they keep their UIs simplistic ... OpenAI is just pushing half baked stuff to prod and moving on (GPTs, Canvas).

What's funny is that OpenAI's Canvas was their attempt to copy Anthropic's Artifacts! So it's not that Anthropic is stagnant while OpenAI is at least shipping; Anthropic is shipping, and OpenAI can't even copy them right.

jeffgreco4 hours ago

It's a good point: Anthropic is being VERY choosy and winds up knocking it out of the park with stuff like Artifacts. Meanwhile their macOS app is junk, but that's obviously not a priority.

cma3 hours ago

> because they keep their UIs simplistic

How do I edit a sent message in the Claude Android app? It's so simplistic I can't find it.

cruffle_duffle6 hours ago

According to all the magazines I've been reading, all that is required is to just prompt it with "please fix all of these issues" and give it a bulleted list with a single sentence describing each issue. I mean, it's AI powered and therefore much better than overpaid prima-donna engineers, so obviously it should "just work" and all the problems will get fixed. I'm sure most of the bugs were the result of humans meddling in the AI's brilliant output.

Right now, in fact, my understanding is that OpenAI is using their current LLMs to write the next-generation ones, which will far surpass anything a developer can currently do. Obviously we'll need to keep management around to tell these things what to do, but the days of being a paid software engineer are numbered.

xarope5 hours ago

I think you forgot the /s (sarcasm) in your post!

imsotiredspacex7 hours ago

I posted the part of the system prompt describing the function call; if you read it and adjust your prompt when creating the task, it works way better.

potatoman227 hours ago

I'd rather have buggy things now than perfect things in a year.

dmadisetti7 hours ago

It doesn't need to be perfect, but using this would actively reduce productivity.

sprobertson7 hours ago

First impressions matter; if the experience is this bad, you're probably waiting a year to come back anyway.

jahewson7 hours ago

Worked out great for Sonos when their timers and alarms didn’t work.

broknbottle2 hours ago

Found the PM

arthurcolle8 hours ago

DateTime stuff is generally super annoying to debug. Can't fault them too badly. Adding a scheduler is a key enabling idea for a ton of use cases

sensanaty8 hours ago

> Can't fault them too badly

The same company that touts to the world their super hyper advanced AI tool that can do everyone's jobs (except the C-level's, apparently) can't figure out how to make a functional cron job happen? And we're giving them a pass, despite the bajillions of dollars that M$ and VCs are funneling their way?

Quite interesting they wouldn't just throw the "proven to be AGI cause it passes some IQ tests sometimes" tooling at it and be done with it.

arthurcolle6 hours ago

It would explain the bugs if they used the AI to make the datetime implementation, though.

dmadisetti8 hours ago

Yeah, they're not exactly a scrappy startup; I'd be surprised if they had zero QA.

Makes me wonder if they have "press releases / Q" as an internal metric to keep up the hype.

airstrike5 hours ago

Maybe that's the Q* we've been hearing rumors about

cbeach8 hours ago

Agreed on date/time being a frustrating area of software development.

But wouldn't a company like OpenAI use a tick-based system in this architecture? i.e. there's an event emitter that ticks every second (or maybe minute), and consumers that operate based on these events in realtime? Obviously things get complicated due to the time consumed by inference models, but if OpenAI knows the task upfront it could make an allowance for the inference time?

If the logic is event driven and deterministic, it's easy to test and debug, right?

singron4 hours ago

The original cron was programmed this way, but it has to examine every job every tick to check if it should run, which doesn't scale well. Instead, you predict when the next run for a job will be and insert that into an indexed schedule. Then each tick it checks the front of the schedule in ascending order of timestamps until the remaining jobs are in the future.
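
A minimal sketch of that indexed-schedule idea (my own illustration, not cron's or OpenAI's actual code): keep (next_run, job) pairs in a min-heap, and on each tick pop only the entries whose time has come.

  # Sketch of an indexed schedule: a min-heap keyed by each job's next run time.
  import heapq

  schedule = []  # heap of (next_run_timestamp, job_id)

  def add_job(job_id, next_run):
      heapq.heappush(schedule, (next_run, job_id))

  def tick(now, run, compute_next_run):
      # Pop only the jobs that are due; everything deeper in the heap is later.
      while schedule and schedule[0][0] <= now:
          _, job_id = heapq.heappop(schedule)
          run(job_id)
          heapq.heappush(schedule, (compute_next_run(job_id, now), job_id))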

This is also a bad case in terms of queueing theory. Looking at Kingman's equation, the arrival variance is very high (a ton of jobs will run at 00:00 and far fewer at 00:01), and the service time also has pretty high variance. That combo will require either high queue-delay variance, low utilization (i.e. over-provisioning), or a sophisticated auto-scaler that aggressively starts and stops instances to anticipate the schedule. Most of the time it's OK to let jobs queue, since most use cases don't care if a daily or weekly job is 5 minutes late.
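
For reference, Kingman's approximation for the mean wait in a G/G/1 queue (quoted from memory, so treat it as a sketch) is roughly:

  E[Wq] ≈ (ρ / (1 − ρ)) · ((c_a² + c_s²) / 2) · τ

where ρ is utilization, τ is the mean service time, and c_a, c_s are the coefficients of variation of interarrival and service times. The bursty arrivals and variable inference times described above enter through the two c² terms, which is why the remaining levers are queue delay, low utilization, or aggressive auto-scaling.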

sky22243 hours ago

Pretty useless so far. I'm not sure what the intended application of this is, but I wanted it to schedule some work for me.

It only scheduled the first thing, and that was after I had to be specific by saying "7:30pm-11pm". I wanted to say "from now to 11pm", but it couldn't process "now".

sandspar54 minutes ago

If you find a tool useless then it's likely that you lack imagination.

halamadrid1 hour ago

This is interesting, although I am a little confused about the purpose of ChatGPT with this feature.

We already have many implementations where, on a cron interval, one can call the GPT APIs for stuff. And it's nice to monitor them and see how things are working, etc.

So I am curious what the use case is for embedding a schedule inside the ChatGPT infrastructure. Seems a little off from its true purpose?

sandspar54 minutes ago

It's for normies.

serjester1 hour ago

This seems like such a strange product decision - why clutter the interface with such a niche use case? I’m trying to imagine OpenAI’s reasoning - a new angle on long term memory maybe? Or a potential interface for their agents?

sandspar53 minutes ago

It's to warm normal people up to the fact that we have agents now.

encoderer2 hours ago

Founder of Cronitor.io here — if you're a developer considering using this, would it be valuable to be able to report in to Cronitor when it runs, so we can keep an eye on things and alert you if your tasks are late, skipped, or accidentally deleted?

We support just about every other job platform but I’d love to hear from potential users before I hack something together.

PittleyDunkin8 hours ago

Where are the release notes?

Edit: I suppose they'll be here at some point: https://help.openai.com/en/articles/9624314-model-release-no...

These seem like extremely shitty release notes. I have no clue why anybody pays for this model.

ben_w7 hours ago

You might want this? It's more technical than the one you linked to:

https://platform.openai.com/docs/changelog

throwup2387 hours ago

The docs for the beta seem to already be up: https://help.openai.com/en/articles/10291617-scheduled-tasks...

speedgoose7 hours ago

It has consistently been the best model for the last two years, and only Gemini is perhaps slightly better now.

TheJCDenton8 hours ago

Nothing yet

sandspar55 minutes ago

It's a tech demo to get normies used to the idea of agents. HackerNews "20 years in industry" guys are flabbergasted because it defaults to UTC and is therefore totally useless, clearly. Perhaps you live in a bubble?

phgn8 hours ago

What am I supposed to see at the link?

swifthesitation7 hours ago

You click the drop-down menu for model selection and choose 4o with scheduled tasks.

nycdatasci6 hours ago

Lots of complaints mentioned here. If you have a legitimate need for a product like Tasks that is more fully baked, I’d encourage you to check out lindy.ai (no affiliation). I’ve been using it to send periodic email updates on specific topics and it works flawlessly.

simple107 hours ago

The UI is different in the desktop app for macOS. The ability to edit scheduled tasks is only available in the web UI for me.

I got the best results by not enabling Search the Web when I was trying to create tasks. It confuses the model. But scheduled tasks can successfully search the web.

It's flaky, but looks promising!

throwaway3141557 hours ago

Less relevant, but why isn't Canvas available in the desktop app? I thought they had feature parity, but it seems not.

reversethread7 hours ago

Does the world need another reminder/todo app?

Many existing apps (like Todoist) have already had LLM integrations for a while now, and have more features like calendars and syncing.

Or do I completely not understand what this product is trying to be?

bogdan3 hours ago

Why not? I already pay for ChatGPT but I don't pay for Todoist, so that doesn't help me.

cbeach8 hours ago

I'm sure it's brilliant, but I have no idea what it's capable of. What will it do? Send me a push notification? Have an answer waiting for me when I come back to it in a while?

I switched over to the "GPT4o with scheduled tasks" model and there were no UI hints as to how I might use the feature. So I asked it "what you can you follow up later on and how?"

It replied "Could you clarify what specifically you’d like me to follow up on later?"

This is a truly awful way to launch a new product.

sandspar52 minutes ago

Maybe it's effective at hitting a goal which you do not see.

benaduggan8 hours ago

After I asked it to schedule something, it prompted me to allow or block notifications, so it sounds like this is just ChatGPT scheduling push notifications? We'll see!

jerpint8 hours ago

So basically cannibalizing Siri?

1propionyl7 hours ago

Siri has access to a wealth of private existing and future on-device APIs to fuel context-sensitive responses to queries on vendor-locked devices used all day long. (Which Apple has apparently decided to just not use yet.)

OpenAI doesn't, they just have a ton of funding and (up to recently) a good mass media story, and the best natural language responses.

The moat around Siri is much deeper, and I don't really see any evidence OpenAI has any special sauce that can't be reproduced by others.

My prediction is that OpenAI's reliance on AI doomerism to generate a regulatory moat falters as they become unable to produce step changes in new models, while Apple's efforts, despite being halting and incomplete, become ubiquitous thanks to market share and access to on-device context.

I wouldn't (and don't) put my money in OpenAI anymore. I don't see a future for them beyond being the first mover in an "LLM as a service" space in which they have no moat. On top of that they've managed to absorb the worst of criticism as a sort of LLM lightning rod. Worst of all, it may turn out that off-device isn't even really necessary for most consumer applications in which case they'll start to have to make most of their money on corporate contracts.

Maybe something will change, but right now OpenAI is looking like a bubble company with no guarantee of keeping its dominant position. Because it is what it is: simply the largest pooling of money to try to corner this market. What else do they have?

siva77 hours ago

Yep, this is a truly bad feature launch. I have no clue what this model does. Did they somehow lose their competent product people?

cbeach8 hours ago

Ah, I've just stumbled on some hints after clicking around: click on your avatar image (top right) and then click "Tasks".

Then there are some UI hints.

"Remind me of your mom's birthday on [X] date"

Wow, really maximising that $10bn GPU investment!

danpalmer8 hours ago

Glad to see that the thriving 2010 market of TODO list apps will see a resurgence in the AI era.

delichon7 hours ago

A todo app that you can write and modify by editing a natural language prompt, and that can parse inputs from the whole web with flexibility and nuance, is not a small thing.

danpalmer7 hours ago

That also seems to not get timezones right, has a confusing search function...?

More seriously, todo apps are about productivity, not just about becoming a huge bucket of tasks. I've always found that the productivity comes from getting context out of my head and scheduled for the right time. This release appears to be more about that big bag of tasks and less about productivity. I'm all for AI in products, I think it can be powerful, but I've not had a use-case for it in my todo app.

TheJCDenton8 hours ago

There is an editable task list, and in the settings menu you can choose to receive notifications via push and/or email.

picografix4 hours ago

Why are they trying to be a model provider as well as a service provider?

rlt2 hours ago

Why wouldn’t they? Most big tech cos offer products at multiple layers of the stack.

rfdearborn4 hours ago

These are best understood as scheduled tasks for the AI instead of tasks for the user.

throwaway3141557 hours ago

This is shaping up to be as bad as the Sora release.

krishadi6 hours ago

For those unable to find this: it shows up as a new model in the model drop-down menu.

krishadi6 hours ago

The biggest outcome here is that now the app has memory.

sagarpatil4 hours ago

A glorified reminder? Really?

golergka5 hours ago

Couldn't you do the same by giving an LLM access to your shell and a cron command?
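
For what it's worth, a minimal sketch of that DIY route (my illustration, assuming the official OpenAI Python SDK and an OPENAI_API_KEY in the environment): a script that cron runs on a schedule and whose output you pipe wherever you like.

  # daily_brief.py -- hypothetical script for the DIY cron route described above.
  # Example crontab entry:  30 8 * * * /usr/bin/python3 /home/me/daily_brief.py
  from openai import OpenAI

  client = OpenAI()  # reads OPENAI_API_KEY from the environment
  resp = client.chat.completions.create(
      model="gpt-4o",
      messages=[{"role": "user", "content": "Summarize today's top headlines."}],
  )
  print(resp.choices[0].message.content)  # pipe to mail / notify-send / etc.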

onemoresoop4 hours ago

Would you give an LLM that privilege?

geepytee7 hours ago

Imagine being an engineer on the Siri team; it must be so demoralizing.

zb38 hours ago

The link doesn't work, presumably because I won't pay OpenAI, which stole my API credits by giving them an "expiration date".

ldjkfkdsjnv8 hours ago

This is going to eat software, and it is the beginning of agents. The orchestrator of these tasks will come, and OpenAI will turn into a general-purpose compute system, the endgame of workflow software. Soon there will be a database, and your prompts will be able to directly read and write to an OpenAI-hosted Postgres instance. And your CRUD app will begin to disappear. Programming will feel pointless.

rglover7 hours ago

Possibly, but that's going to require 100% consistent, accurate outputs (tricky, as that's not the nature of LLMs).

Otherwise, you'll have a lot of systems dependent on these orchestrators creating hard-to-debug mistakes up and down the pipeline. With software, you can reach a state where it does what you tell it to without having to worry if some model adjustment or API change is going to break the output.

If they solve that, then yes. Otherwise, what I personally expect is a lot of businesses rushing into implementing "agents" only to backpedal later when they start to have negative material effects on bottom lines.

ldjkfkdsjnv7 hours ago

It's inevitable. You can argue about what's possible right now, but I'm not looking at it from that angle. I think these issues will be solved with time.

rglover7 hours ago

That belief is at odds with the mechanics of how LLMs work. It's not a question of more effort/investment/compute/whatever; it's just a reality of how the underlying systems work (non-deterministic). If you can find a way to make the context window on the scale of the human brain, you may be able to mostly mitigate this.

People want us to be at "Her" levels of AI, but we're at a far earlier stage. We can fake certain aspects of that (using TTS), but blindly trusting an AI to run everything is going to be a big mistake in the short-term. And in order for the inevitability of what you describe to take place, the predecessor(s) to that have to work in a way that doesn't scare people and businesses away.

The plowing of money and hype into the current forms of AI (not to mention the gaslighting about their ability) makes me think the real inevitability is a meltdown in the next 5-10 years which leads to AI-hesitancy on a mass scale.

potatoman227 hours ago

Why? Past progress =/= equal rate of future progress.

ryan937 hours ago

They are using infinite compute and can't do simple notifications. How will changing the architecture slightly or ingesting more data change that?

worldsayshi8 hours ago

Sure, but do they have a moat here? Anyone who can connect to an LLM could make that app.

zb37 hours ago

Yes, they have the name "ChatGPT". For non-technical people this appears to be the most important thing.

nozzlegear7 hours ago

Is it a household name? Anecdotally, only two of my five millennial/gen-z siblings use an AI app at all, and one of them calls hers "Gary" instead of ChatGPT. I'd be interested in seeing some actual data showing how much ChatGPT is an actual household name versus one that we technical people assume is a household name due to its ubiquity in our space.

ben_w7 hours ago

> Is it a household name?

I think it is, yes.

It was interviewed under that name on one of the UK's main news broadcasts almost immediately after it came out. Few hundred million users. Anecdotes about teachers whose students use it to cheat.

But who knows. I was surprising people about the existence of Wikipedia as late as 2004, and Google Translate's augmented reality mode some time around the start of the pandemic.

ldjkfkdsjnv7 hours ago

Does AWS have a moat on cloud computing?

scarface_746 hours ago

Yes, it would take tens of billions of dollars to recreate the infrastructure as far as servers go, and AWS has its own pipelines running under the oceans.

Then you have to recreate all of the services on top of the AWS.

Then you have to deal with regulations and certifications.

Then you have to convince decision makers to go against their own interests. “No one ever got fired for Amazon”.

Then you have to convince corporations to spend money to migrate.

daveguy7 hours ago

This significantly overestimates the reliability of LLMs -- both their output integrity and their ability to understand context.

throwaway3141557 hours ago

Bit of advice: you might want to actually use an offering before claiming it is revolutionary.

ldjkfkdsjnv7 hours ago

I've got 15 years of engineering experience and have worked on some of the largest distributed systems at FAANG. It's coming.

scarface_746 hours ago

> worked on some of the largest distributed systems at FAANG.

As have tens of thousands of other people who could invert a btree on the whiteboard…

throwaway3141557 hours ago

Oh wow good for you! Didn't realize you were a prodigy or that this was a contest. I take it all back. /s

Maybe try some humility. You're not helping yourself with the bragging about frankly underwhelming and common (here) experience.