Next.js App Router + React Server Components Demo

252 points by armcat 16 hours ago | 161 comments

lateforwork 15 hours ago [-]

This looks bad for Microsoft. They added a Copilot button to all their products but it doesn't do much more than open a chat side panel.

I recently tried Claude Cowork for PowerPoint and I was stunned by the content as well as design quality of the deck it produced. That's a threat for Microsoft because now you don't need the editing tools of PowerPoint, AI replaces it, so all you need is the presentation mode of PowerPoint.

Copilot for Excel is useless. Ask it what is in cell A1 and it can't answer. I am looking forward to trying ChatGPT for Excel.

giancarlostoro 20 minutes ago [-]

I am still surprised that, outside of open source AI models, Microsoft is just routing to external models. To a degree it's kind of smart, because they don't have to have all the skin in the game for the infrastructure, plus they sell some of the hosting anyway. But man, why does Microsoft not have a frontier model yet? Any time in the last few years would have been a great time to introduce a real Cortana AI model.

nsiemsen 14 hours ago [-]

Claude for Excel is already amazing. Fully capable of doing junior work. Formatting is great. Can refactor large multi-tab spreadsheets. It just burns tokens. If OpenAI is going to subsidize this on the monthly enterprise plans for a while then it's a game changer.

Claude for Excel (I work in finance) was one of the absolutely critical reasons we added Anthropic enterprise licenses. But they've turned out to be quite expensive ($100/day for heavy users). We'll see what OpenAI's quotas are.

wouldbecouldbe 5 hours ago [-]

I work with large files a lot, and running Claude Code on them is not token-intensive at all, probably because it does a lot with scripts. It's a bit more raw, but I think in the end more powerful. You have to pick a good Excel library and language. I use Node; maybe Python can work as well.

intended 8 hours ago [-]

How’s that been in practice? From what I’ve been following, Claude in finance results in models with errors that an analyst wouldn’t make.

You get models that are formatted and structured and which balance - but there are errors introduced which an analyst / human wouldn’t make.

Stuff like hard coded values, or incorrect cell logic which guarantees the model balances.
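One way to catch the hard-coded values mentioned above is to scan formula cells for numeric literals. The following is only a rough sketch under stated assumptions: the `cells` mapping, the function name `find_hardcoded_numbers`, and the sample formulas are all hypothetical, and the regex is a heuristic (it ignores scientific notation and will not understand every Excel syntax).

```python
import re

# A number that is not part of a cell reference (B2, $A$1), a function name
# (LOG10), or a row/column range (1:3). Heuristic only.
_NUMBER = re.compile(r"(?<![A-Za-z0-9_$.:])\d+(?:\.\d+)?(?![A-Za-z0-9_.:(])")
_DOUBLE_QUOTED = re.compile(r'"[^"]*"')   # string literals
_SINGLE_QUOTED = re.compile(r"'[^']*'")   # quoted sheet names


def find_hardcoded_numbers(cells):
    """Return {address: [literals]} for formulas containing numeric literals.

    `cells` maps an address such as "C5" to the cell's raw text. Only text
    starting with "=" is treated as a formula; plain inputs are skipped.
    """
    flagged = {}
    for address, text in cells.items():
        if not text.startswith("="):
            continue
        # Drop quoted text first so digits inside it are not reported.
        stripped = _DOUBLE_QUOTED.sub('""', text)
        stripped = _SINGLE_QUOTED.sub("''", stripped)
        literals = _NUMBER.findall(stripped)
        if literals:
            flagged[address] = literals
    return flagged


# Hypothetical sample data, for illustration only.
sample = {
    "B5": "=SUM(B2:B4)",
    "C5": "=B5*1.05",
    "D5": "=C5+250000",
    "E5": '=IF(A1="FY25",B5,C5)',
    "F5": "412500",
}
print(find_hardcoded_numbers(sample))
# {'C5': ['1.05'], 'D5': ['250000']}
```

A flagged literal is not necessarily an error (a tax rate may be intentional), so this is a list of cells to review by hand, not a verdict.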

balderdash 2 hours ago [-]

Just my experience: it’s not a solution but rather a productivity tool. I mostly use it for tasks I can do myself but that would probably take 20-30 min to dial in; now Claude can do them in 2-3 min. (E.g., in a data table, add a new column that checks column a: if the data is a, do x; if the data is b, do y; if the data is c, do z; then combine that with the word after the hyphen in column b. Or, another example: create a new sheet in the same format as sheet one that shows the difference between columns a and b for sheets 1-12 in a summary.)

I don’t get good results when I just have Claude build things on its own - but for these types of specific productivity tasks I can save a couple of hours here and there.

mukmuk 4 hours ago [-]

From my experience, LLM performance in these areas is being massively oversold. I have repeatedly tried using Claude to modify a range of models typical of investment banking / private equity / sellside research contexts, and the results have been generally disastrous. On multiple occasions, the xlsx would no longer open.

p_ing 14 hours ago [-]

Cheaper to get M365 Copilot licenses for the Claude models in Excel.

jxmesth 4 hours ago [-]

I tried looking this up but wasn't able to find info on this on Microsoft's website. Do you have a link for this?

p_ing 13 minutes ago [-]

https://support.microsoft.com/en-us/topic/choose-your-model-...

WillAdams 12 hours ago [-]

What are the costs on that?

Does this remove (or at least increase) the upload limit?

p_ing 11 hours ago [-]

$200-something per user per year. Will vary based on license type and seat count.

No limits.

mastermage 6 hours ago [-]

Well, other than the limits of Copilot's usefulness.

croes 8 hours ago [-]

> No limits.

Yet.

evanjrowley 14 hours ago [-]

There is a significant difference in experience between Copilot Basic for a M365 user whose IT admins have blocked integration capabilities with Sharepoint content vs Copilot Premium for a M365 user whose IT admins have allowed integration capabilities with Sharepoint content.

interroboink 14 hours ago [-]

A recent funny story on this topic: https://idiallo.com/blog/what-is-copilot-exactly

HN discussion: https://news.ycombinator.com/item?id=47603231

mohamedkoubaa 14 hours ago [-]

Microsoft is better off not allowing copilot basic because of the reputational harm it will do. Not that they are thinking through copilot rationally

basch 12 hours ago [-]

it was a good name when chosen. too bad they have burned bob, clippy, cortana, sydney, and copilot already.

compass_copium 21 minutes ago [-]

Don't forget Tay!

alternatex 6 hours ago [-]

The backend of Copilot is still called Sydney AFAIK

LuxBennu 14 hours ago [-]

ChatGPT for Excel is still an Office add-in running in the same sandbox, though. strongpigeon described the exact bottleneck upthread: process boundary crossings and context.sync() round trips that take seconds on web. That's a platform limitation, not a model limitation. Swapping the AI behind the add-in doesn't fix the fundamental constraint that third-party add-ins can't integrate with Excel's runtime as deeply as a native feature can. If Copilot is bad despite having more access to Excel internals (I don't like how Copilot is designed or implemented, though), an add-in with less access is not likely to be better.

angadsg 13 hours ago [-]

Would love for you to try both copilot and ChatGPT for Excel. Agreed on the limitations - but in our experience, ChatGPT for Excel does really well on complex sheets.

com2kid 14 hours ago [-]

There is an irony here that this would be more performant with a 2002 coding model. A native plugin, COM, OLE, whatever. C++, crash prone, but fast.

strongpigeon 12 hours ago [-]

Maybe but not drastically so. My guess is that most of the slowness comes from the tool calls round tripping+processing on Anthropic/OpenAI’s servers rather than the app latency.

That’s without talking about the poor UI and security story of COM add-ins and the inability to run on Excel for iOS.

phyalow 6 hours ago [-]

Try writing your own comments rather than posting AI slop

Gareth321 5 hours ago [-]

> They added a Copilot button to all their products but it doesn't do much more than open a chat side panel.

I was hyped when I heard about Copilot. "I can tell it to make pivot tables now!" When I tried to use it I was shocked how underbaked it was. Below even my worst expectations. This really was someone shoving ChatGPT into Excel with almost zero additional effort. Copilot can't DO anything useful.

chris_money202 4 hours ago [-]

stride.microsoft.com -> this is a virtual machine instance with developer tools that allows the same sort of work Claude Cowork does. Copilot in Excel has to access the Excel document through Excel-provided APIs; because the document instance is open, it can’t completely regenerate the document each time by running developer scripts, the way Cowork does. The model of work is entirely different.

screye 15 hours ago [-]

If AI winning means that data center companies win out, then the wins for Azure will more than make up for the death of Office.

I am surprised that Microsoft's own copilot product is so far behind though.

boringg 18 minutes ago [-]

Aren't they providing a wrapper for the work of another company? I.e., MSFT isn't actually doing any foundational work, so they can't meaningfully move product capability; they can just wait for the model to improve and integrate it?

ryanjshaw 5 hours ago [-]

There’s a magic button you have to press to make it integrate fully. Everybody is confused about why this isn’t the default behavior.

vipipiccf 7 hours ago [-]

I've had the same experience. Copilot for Excel can't even parse basic cell references. Meanwhile Claude handles document formatting in one pass. The catch is it works externally, not inside the app, but at least it works.

The MCP ecosystem is what makes this interesting. Claude isn't just a chat panel bolted onto existing software, it's building integrations that actually manipulate the files. Microsoft had the distribution advantage but they're losing on capability.

chris_money202 4 hours ago [-]

stride.microsoft.com -> microsoft has this stuff you just don’t know it unless you are an M365 power user

compass_copium 15 minutes ago [-]

I would consider myself an M365 power user and I was not aware of this. It is not well promoted--and after all the Copilot crap, I would be annoyed even if it was.

Regardless, I just tried to log in with my work MS account, and I can't do so.

vessenes 13 hours ago [-]

Microsoft has rights to all this IP. So, it might look bad for their product folks, but for the corporation this is great, to the extent it works.

sarreph 5 hours ago [-]

> This looks bad for Microsoft.

Maybe(?) from a product catalogue perspective... But from a strategic perspective less so because they own ~27% of OpenAI.[0]

[0] - https://openai.com/index/next-chapter-of-microsoft-openai-pa...

xeyownt 3 hours ago [-]

It would be bad for Microsoft if that used Calc on LibreOffice.

ebbi 15 hours ago [-]

We have many people in my wider team (Finance) that are AI skeptics purely because of their experience with Copilot. Like they don't know what AI is actually capable of when outside of the shackles of Copilot.

Microsoft fumbled so badly here.

chris_money202 4 hours ago [-]

stride.microsoft.com is the cowork equivalent I believe.

p_ing 9 minutes ago [-]

Only for personal accounts. Enterprise customers have a Frontier agent called Copilot Cowork via the M365 Copilot app.... copilot.

Handy-Man 14 hours ago [-]

You have to use the "agent" toggle for Copilot to behave the same way lol. Otherwise it's a pretty simple chat interface with the context, that's all.

bwat49 15 hours ago [-]

It's baffling how badly Microsoft has handled Copilot; this is exactly what Copilot in Office should have been.

miohtama 15 hours ago [-]

It's called Microslop for a reason.

d3Xt3r 14 hours ago [-]

> I recently tried Claude Cowork for PowerPoint and I was stunned by the content as well as design quality of the deck it produced. That's a threat for Microsoft because now you don't need the editing tools of PowerPoint, AI replaces it, so all you need is the presentation mode of PowerPoint.

Actually, someone here posted a Claude Code skill recently that generates a presentation as a self-contained HTML5 file, so all you need is a browser.

PowerPoint, as a whole, is doomed.

hgoel 12 hours ago [-]

Powerpoint will continue to persist because other people need to be able to edit your slide deck without understanding your HTML.

My employer blocks office plugins, so I can't try Claude for PowerPoint, but sometimes I get Claude to generate Python scripts, which produce PowerPoint slides via python-pptx. This also benefits from being able to easily read and generate figures from raw data.

I don't really like the way Claude tends to format slides (too much marketing speak and flowcharts), but it has good ideas often enough that it's still worth it to me. So I treat this as a starting point and replace the bad parts.

usrme 1 hours ago [-]

I'd love to get a link to that comment/post!

basch 12 hours ago [-]

Or you could just talk to powerpoint, which creates a self contained pptx, which also plays anywhere.

we've hit this point where it's cool to have claude reinvent every wheel just because it can.

d3Xt3r 7 hours ago [-]

It's not self-contained; it requires PowerPoint to be installed. Which is not an issue on corporate machines of course, but maybe you want to do a presentation for a general/broader audience.

alternatex 6 hours ago [-]

Office, or rather Microsoft 365 applications have had web versions for a decade now.

d3Xt3r 6 hours ago [-]

That's beside the point though. With a self-contained HTML file, you don't need to go to a special website, you don't need an account or sign-in, heck you don't even need the Internet, and it works on pretty much every device that supports HTML5.

jason_zig 14 hours ago [-]

I'm not sure that's true - try getting someone to pull up an html5 file on their computer for a presentation...

DrSAR 13 hours ago [-]

hrm, double-click and your browser does the rest.

For added benefit, full screen?

Until you need presenter notes or other niceties, this covers a large space of usage.

raincole 14 hours ago [-]

You mean like, double-click?

apsurd 14 hours ago [-]

you must never have actually done this. it doesn't work the way you think it does. unless it's self contained (like a pp), you can't expect network access to actually deliver when you need it most.

d3Xt3r 12 hours ago [-]

The file the Claude skill spits out is actually fully self-contained, no network access is needed.

apsurd 11 hours ago [-]

that's pretty cool!

apsurd 14 hours ago [-]

you could do that for the past 20 years. i've always hated slides as a medium for anything, but i've been proven wrong time and again that people love their pp.

bad_haircut72 9 hours ago [-]

Because it was a drag-and-drop interface. This existed for HTML, but because web pages got too complicated, so did the WYSIWYGs. By just being a program to show slides, the editing experience was manageable for anyone. But if you can just type what you want to happen into Claude, the editing experience doesn't matter as much, or at all.

arjie 10 hours ago [-]

I’ve always found it unbelievable how bad Gemini’s Google Sheets interaction is. Copying the sheets into Claude and then modifying them there and copying them back actually outperforms it.

Nowadays I just make single-purpose websites with Claude Code because Google Sheets has such poor AI integration and is outrageously tedious to edit.

They had all the parts and I have a subscription and it still does terrible things like prompt me to use pandas after exporting as a CSV. It will mention some cell and then can’t read it. It can’t edit tables so they just get overwritten with other tables it generates.

It reminds me of something a friend told me: he heard that Google employees do dogfood their products; some even multiple times every year. There’s no way anyone internal uses Sheets even that often.

bdcravens 5 minutes ago [-]

I tried it the other day to work on some exported CSVs when doing my taxes. I was finally able to get it to do what I wanted, but it was definitely an exercise, feeling like I was talking to ChatGPT from a couple of years ago (as in a really smart but easily distracted and confused child).

charlieflowers 9 hours ago [-]

I'm having great luck having Claude Code generate, read, and update spreadsheets by writing Python code that uses gspread.

speleding 2 hours ago [-]

It also works fine with Ruby and the "caxlsx" gem. Codex works fine with it as well.

VadimPR 8 hours ago [-]

Can it work with comments in sheets as well? When I looked into it, that seemed like a limitation.

yabutlivnWoods 9 hours ago [-]

My local models interact with Sheets exclusively over the API with Python scripts I've been curating for years

Given how well the API works, that we are discussing Googlers, my guess is that's how they dog food their services. Programmers don't get hired by Google for mouse skills.

The GUI is for spot checking results, final presentation.

If you're sitting there point-n-clicking everything into place perhaps consider you are doing it wrong.

beepdyboop 9 hours ago [-]

That sounds like an extremely narrow use case, compared to what the vast majority of Sheets users will be comfortable with

mbreese 8 hours ago [-]

At the same time, it makes some sense... the programmers for a system aren't always the best users of a system. So if you're expecting them to dogfood their own system (Google Sheets), you might find that they test/interact with the system primarily through the API and not the GUI.

I have no idea if they do or not, but it's a plausible explanation...

yabutlivnWoods 8 hours ago [-]

Use case feels like the wrong term.

Do you mean restricted workflow? Google's APIs are pretty much 1:1 to the GUI

And using Python makes it trivial to copy-paste out of files and other APIs with one run of Python

Versus all the fiddling in browser tabs with a mouse, it actually affords an incredibly wide set of options to quickly collate and format data

intended 8 hours ago [-]

How? This argument would make sense if sheets wasn’t targeted at a general audience.

dminik 4 hours ago [-]

Yeah, the Sheets integration is weird. It's usually ok when it wants to place something down the first time. But then it seems incapable of making any changes to it. Or even acknowledging the data in the sheet. What's up with that?

buccal 9 hours ago [-]

You should try MS Copilot which uses open source Python libraries to interact with Office file formats.

The libraries themselves are OK, but MS uses them stupidly. If you want to fill out some form in DOCX or XLSX format you will get broken formatting. And this is from the Office company.

darkwater 7 hours ago [-]

Obviously. Because they didn't train the model on proprietary MS code. Which is bad but also good in some way, as it might force MS to better support their formats in the open source world.

devmor 9 hours ago [-]

I recently experimented with trying to generate a passable slide deck from a script and outline I had written beforehand. The ChatGPT integration built into Powerpoint was abysmally bad. Like to the point it was embarrassing as a product.

Claude one-shot something with a Python script that was pretty okay.

AznHisoka 9 hours ago [-]

I love Sheets, but I don't care for using Gemini to interact with Sheets. It seems like a recipe for disaster. Do I really want it to muck around with thousands of rows and no intuitive way to diff its changes? Nope, sticking with basic Sheets

killerdhmo 9 hours ago [-]

I mean, you're wrong. As a Xoogler, everything was in Sheets. Our roadmap was in Sheets. It's more they don't care.

strongpigeon 15 hours ago [-]

Oh wow, I used to work on Excel Add-Ins about 10 years ago. Even got a patent for it. I'd be curious to see how they implemented the calls.

We came up with what I still consider a pretty cool batch-rpc mechanism under the hood so that you wouldn't have to cross the process boundary on every OM call (which is especially costly on Excel Web). I remember fighting so hard to have it be called `context.sync()` instead of `context.executeAsync()`...

That being said, done poorly it can be slow as the round-trip time on web can be on the order of seconds (at least back then).

Acmeon 14 hours ago [-]

Do you mean that you worked on the Excel Add-Ins platform in Excel (and not on a specific Add-In)?

If you were working on the platform itself, then I would be interested in hearing your more detailed thoughts on the matters you mentioned (especially since I am developing an open source Excel Add-In Webcellar (https://github.com/Acmeon/Webcellar)).

What do you mean by an "OM" call? And why are they especially costly on Excel web (currently my add-in is only developed for desktop Excel, but I might consider adding support for Excel web in the future)?

In any case, `context.sync()` is much better than `context.executeAsync()`.

strongpigeon 14 hours ago [-]

I worked on the Excel Add-Ins platform at Microsoft, yes. By OM call I mean "Object Model" call, basically interacting with the Excel document.

The reason those calls are expensive on Excel Web is that you're running your add-in in the browser, so every `.sync()` call has to go all the way to the server and back in order to see any changes. If you're doing those calls in a loop, you're looking at 500ms to 2-3s latency for every call (that was back then, it might be better now). On the desktop app it's not as bad since the add-in and the Excel process are on the same machine so what you're paying is mostly serialization costs.
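To make the cost model above concrete, here is a minimal sketch of the general batching idea: operations are queued locally and only cross the boundary when `sync()` is called. This is not the Office.js implementation; `FakeWorkbook`, `Context`, `PendingValue`, and their methods are hypothetical names invented for illustration, and the "boundary" is just a counter.

```python
from typing import Any


class FakeWorkbook:
    """Stands in for the host application on the far side of the boundary."""

    def __init__(self) -> None:
        self.cells: dict[str, Any] = {}
        self.round_trips = 0  # how many times the boundary was crossed

    def execute_batch(self, ops: list[tuple[str, str, Any]]) -> list[Any]:
        # A single call handles every queued operation, in order.
        self.round_trips += 1
        results: list[Any] = []
        for kind, address, value in ops:
            if kind == "set":
                self.cells[address] = value
                results.append(None)
            elif kind == "get":
                results.append(self.cells.get(address))
            else:
                raise ValueError(f"unknown operation: {kind}")
        return results


class PendingValue:
    """Placeholder for a read; it is only filled in after sync()."""

    def __init__(self) -> None:
        self._loaded = False
        self._value: Any = None

    @property
    def value(self) -> Any:
        if not self._loaded:
            raise RuntimeError("call sync() before reading this value")
        return self._value


class Context:
    """Queues operations locally and sends them in one batch on sync()."""

    def __init__(self, workbook: FakeWorkbook) -> None:
        self._workbook = workbook
        self._queue: list[tuple[str, str, Any]] = []
        self._pending: list[PendingValue | None] = []

    def set_value(self, address: str, value: Any) -> None:
        self._queue.append(("set", address, value))
        self._pending.append(None)

    def get_value(self, address: str) -> PendingValue:
        placeholder = PendingValue()
        self._queue.append(("get", address, None))
        self._pending.append(placeholder)
        return placeholder

    def sync(self) -> None:
        if not self._queue:
            return
        results = self._workbook.execute_batch(self._queue)
        for placeholder, result in zip(self._pending, results):
            if placeholder is not None:
                placeholder._value = result
                placeholder._loaded = True
        self._queue, self._pending = [], []


workbook = FakeWorkbook()
ctx = Context(workbook)
for row in range(1, 101):
    ctx.set_value(f"A{row}", row * 2)   # queued, no round trip yet
last = ctx.get_value("A100")            # also queued
ctx.sync()                              # one round trip for all 101 operations
print(workbook.round_trips, last.value)  # 1 200
```

Calling `sync()` inside the loop instead would make 100 round trips, which is the pattern the comment above warns about when each trip costs hundreds of milliseconds.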

Happy to answer more questions, though I left MSFT in 2017 so some things might have changed since.

Acmeon 14 hours ago [-]

Yeah, that makes sense. For some reason, I was under the impression that all calculations run locally in the browser, which would have been comparable to how Excel desktop works (i.e., local calculations). Is there a reason why the Excel calculations run on the server (e.g., excessive workload of a browser implementation, proprietary code, difficult to implement in JavaScript, cross-browser compatibility issues, etc.)? Furthermore, if the reason for this architecture is (or was) limitations in JavaScript or browsers, do you find it plausible that the Excel calculations will some day be implemented in WebAssembly?

Regardless, I have always preferred Excel desktop over Excel web (and other web based spreadsheet alternatives). This information makes me somewhat less interested in Excel web. Nonetheless, I find Excel Add-Ins useful, primarily because they bring the capabilities of JavaScript to Excel.

strongpigeon 13 hours ago [-]

I don’t think Excel web will ever be running the calc engine browser side, no. The only way I could see this happen would be via compiling the core to wasm, which I don’t think is worth the engineering effort.

Excel has this legacy (but extremely powerful) core with very few people left who know all of it. It has legacy bugs preserved for compatibility reasons, as whole businesses are run on spreadsheets that break if the bug is fixed (I’m not exaggerating). The view code for xldesktop is not layered particularly well either, leading to a lot of dependencies on Win32 in xlshared (at least back then).

Is it doable? I’m sure. But the benefits are probably not worth the cost.

Acmeon 13 hours ago [-]

Thanks for the interesting info! Yeah, maybe Excel web will someday support local calculations via wasm, but for now I think I will stick with Excel desktop with add-ins.

com2kid 14 hours ago [-]

Does Excel for Web still spin up an actual copy of Excel.exe on a machine somewhere? I heard that is how the initial version worked.

strongpigeon 13 hours ago [-]

No, as the other comment mentioned. But I’ve heard of more than a few customers running their own “server excel workflow” where they have an instance of excel.exe running a VBA macro that talks to a web server (and does some processing).

p_ing 13 hours ago [-]

Never did this. WAC was the original version (integrated with SharePoint Server). Everything was server-side.

com2kid 11 hours ago [-]

While working at MS I remember someone on the Office team saying that the original version of Excel online spun up the actual Excel backend and had it output HTML instead of the usual win32 UI. Was I misinformed by chance?

p_ing 42 minutes ago [-]

Excel Online was a component of WAC. It was an ASP.NET (and C++???) web application that used OAuth between SharePoint Server and Exchange Server.

So I mean yes, you viewed Excel docs through a webpage just like you do today via ODSP or OneDrive consumer. The backend is completely different in the cloud service, though.

strongpigeon 13 hours ago [-]

> WAC

Now that’s an acronym that I had forgotten about.

DaiPlusPlus 14 hours ago [-]

> though I left MSFT in 2017 so some things might have changed since.

Honestly, I struggle to think about what has actually changed between Office 2013 and Office 2024 (and their Office 365 equivalents); I know the LAMBDA function was a big deal, but they made the UI objectively worse by wasting screen-space with ever-increasingly phatter non-touch UI elements; and the Python announcement was huge... before deflating like a popped party balloon when we learned how horribly compromised it was.

...but other than that, Excel remains exactly as frustrating to use for even simple tasks - like parsing a date string - today just as it was 15 years ago[1].

[1]: https://stackoverflow.com/questions/4896116/parsing-an-iso86...

angadsg 14 hours ago [-]

Hi everyone, engineer on ChatGPT for Excel here - we launched ChatGPT for Excel to bring the power of GPT-5.4 to Excel. Keen to hear feedback and happy to answer any questions!

bsenftner 1 hours ago [-]

I've had a spreadsheet integrated with ChatGPT API for a few years already. It really was not until GPT-5.4 that the models were able to actually be useful.

What is the data model that you use for the spreadsheet itself? I found I could create a chat completion persona that believed it is one of the developers of a popular open source spreadsheet, and I put this "agent" directly inside the open source spreadsheet. I did this before tool calling was available at all, so I made my own system for that, and the "tools" are the API of that open source spreadsheet. My agent(s) that operate like this can do anything the spreadsheet can do, including operate the spreadsheet engine from the inside.

carderne 7 hours ago [-]

What API/approach does it use to edit sheets?

I made a CLI (+skill) so agents could edit files with verbs like `insert A1:A3 '[1,2,3]'`, but did some evals and found it underperformed Anthropic's approach (just write Python).
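For readers wondering what a verb like `insert A1:A3 '[1,2,3]'` involves, here is a small sketch of one possible interpretation: expand the A1-style range into cell coordinates and assign one JSON value per cell. The commenter's actual CLI semantics are not given, so `parse_cell`, `expand_range`, and `apply_insert` are hypothetical helpers, and the "sheet" is just a dictionary.

```python
import json
import re

_CELL = re.compile(r"^([A-Z]+)([1-9][0-9]*)$")


def column_index(letters):
    """Convert column letters to a 1-based index: A -> 1, Z -> 26, AA -> 27."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def parse_cell(ref):
    """Parse 'B3' into (row, column), both 1-based."""
    match = _CELL.match(ref.upper())
    if not match:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    return int(match.group(2)), column_index(match.group(1))


def expand_range(ref):
    """Expand 'A1:A3' (or a single cell) into (row, column) pairs, row-major."""
    start, _, end = ref.partition(":")
    r1, c1 = parse_cell(start)
    r2, c2 = parse_cell(end or start)
    rows = range(min(r1, r2), max(r1, r2) + 1)
    cols = range(min(c1, c2), max(c1, c2) + 1)
    return [(r, c) for r in rows for c in cols]


def apply_insert(sheet, range_ref, payload):
    """Write a flat JSON array into the range; sizes must match exactly."""
    values = json.loads(payload)
    cells = expand_range(range_ref)
    if len(values) != len(cells):
        raise ValueError(f"{len(values)} values for {len(cells)} cells")
    for cell, value in zip(cells, values):
        sheet[cell] = value
    return sheet


print(apply_insert({}, "A1:A3", "[1,2,3]"))
# {(1, 1): 1, (2, 1): 2, (3, 1): 3}
```

Rejecting a size mismatch outright, rather than truncating or repeating values, is a design choice made here so that a wrong range from the agent fails loudly.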

rahimnathwani 11 hours ago [-]

How well does this work compared with using GPT-5.4 in Nicopreme’s Pi for Excel?

howdareme9 4 hours ago [-]

have you got a link to this?

rahimnathwani 39 minutes ago [-]

Sorry, I got the author wrong.

It's here: https://github.com/tmustier/pi-for-excel

e38383 8 hours ago [-]

I probably could find some really useful things for it to help me … but all software nowadays only works outside my earth region :(

This time even for pro.

TrackerFF 15 hours ago [-]

I've experimented with ChatGPT for spreadsheets the past 6 months, and while the results look nice now it has been excruciatingly slow for even the simplest spreadsheet. I'm talking 15-20 minutes to make some pretty basic calculator with graphs. IIRC, it used a lot of time purely on the styling.

angadsg 13 hours ago [-]

Engineer on ChatGPT for Excel here. Useful feedback. We have improved the latency inside the add-in a lot, with a lot more to come. We also have the Fast, Standard and Heavy thinking modes, where you can adjust the thinking time depending on the task complexity. Curious to hear your feedback once you try this out!

jannyfer 15 hours ago [-]

Adding a tangential anecdote.

I asked GPT-5.4 High to draw up an architecture diagram in SVG and left it running. It took over an hour to generate something and had some spacing wrong, things overlapping, etc. I thought it was stuck, but it actually came back with the output.

Then I asked it to make it with HTML and CSS instead, and it made a better output in five seconds (no arrows/lines though).

SVG looks similar to the XML format of spreadsheets. I wonder if LLMs struggle with that?

bob1029 14 hours ago [-]

The LLMs seem to struggle at anything that isn't relatively well anchored in whatever space. HTML documents have a lot of foundation to them in the training data, so they seem to perform well by comparison to other things.

I just spent a few hours trying to get GPT5.4 to write strict, git compatible patches and concluded this is a huge waste of time. It's a lot easier and more stable to do simple find/replace or overwrite the whole file each time. Same story in places like Unity or Blender. The ability to coordinate things in 3d is really bad still. You can get clean output using parametric scenes, but that's about it.
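As an illustration of the "simple find/replace" style of edit mentioned above, here is a minimal sketch that applies an exact-match replacement and refuses to proceed unless the target text occurs exactly once. The function name `apply_edit` and the sample text are hypothetical; this is one plausible shape for such a tool, not the commenter's setup.

```python
def apply_edit(text, find, replace):
    """Replace `find` with `replace`, but only if `find` occurs exactly once.

    Requiring a unique match avoids silently editing the wrong location,
    which is the failure mode that line-number-based patches are prone to.
    """
    count = text.count(find)
    if count == 0:
        raise ValueError("search text not found")
    if count > 1:
        raise ValueError(f"search text is ambiguous ({count} matches)")
    return text.replace(find, replace, 1)


source = "rate = 0.05\nyears = 15\n"
print(apply_edit(source, "years = 15", "years = 30"))
```

Unlike a unified diff, this needs no line numbers or context hunks, so there is less for a model to get subtly wrong.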

jqbd 7 hours ago [-]

Parametric scenes are the whole of Houdini and any node-based compositor etc., so there are some applications, no?

scronkfinkle 15 hours ago [-]

Claude's diagramming tool that they have built into their web UI is my go-to for this task. It's reliable enough that I often will delegate to it first with what I need written in prose instead of using mermaid/lucid diagram

brett-jackson 14 hours ago [-]

I’d try asking it for a mermaid diagram. I think ChatGPT’s web interface will render them.

cubefox 14 hours ago [-]

Gemini is very good with SVG, but I don't really see the similarity to spreadsheets.

chux52 46 minutes ago [-]

How do the OpenAI models/reasoning effort map to Fast, Standard, Heavy in the add-in?

flybrand 15 hours ago [-]

Several months ago, ChatGPT swore to me it had interoperability with both Excel and Google Sheets. I spent 90 minutes thinking I was an idiot, trying to follow its guidance before asking the internet.

kbos87 2 hours ago [-]

From time to time I've tried using ChatGPT for financial modeling, and I have to say my experiences don't inspire much confidence.

Just this past week I used it to generate a simple model of a few different scenarios related to an investment property I own.

The first problem I ran into is that it was unable to output a downloadable XLS file. Not a huge deal - it suggested generating CSV tables I could copy/paste into a spreadsheet. The outputs it gave me included commas in a handful of numbers over 1,000 (but not all of them!) which of course shifted cells around when brought into Google Sheets. We pivoted our approach to TSV and solved this problem. Big deal? No. Seemingly basic oversight? Absolutely.
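The comma problem described above is easy to reproduce. The sketch below uses made-up figures (they are not from the comment) and compares three ways of serializing the same rows: a naive comma join, the standard `csv` writer (which quotes fields containing the delimiter), and tab-separated output.

```python
import csv
import io

# Hypothetical rows; the numbers are formatted with thousands separators.
rows = [
    ["Year", "Rent", "Mortgage balance"],
    [1, "24,000", "412,500"],
    [2, "24,720", "401,300"],
]


def naive_csv(rows):
    """Join with commas and no quoting, which breaks on '24,000'."""
    return "\n".join(",".join(str(cell) for cell in row) for row in rows)


def quoted_csv(rows):
    """Let the csv module quote any field that contains a comma."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def to_tsv(rows):
    """Use tabs as the delimiter so commas inside numbers are harmless."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def column_counts(text, delimiter=","):
    """Parse the text back and report how many columns each row has."""
    return [len(row) for row in csv.reader(io.StringIO(text), delimiter=delimiter)]


print(column_counts(naive_csv(rows)))              # [3, 5, 5]
print(column_counts(quoted_csv(rows)))             # [3, 3, 3]
print(column_counts(to_tsv(rows), delimiter="\t"))  # [3, 3, 3]
```

Either quoting or switching to TSV keeps the columns aligned; emitting plain unformatted numbers (24000 rather than "24,000") would avoid the problem entirely.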

This is where the real fun began. Once I started to scrutinize and understand the model it built, I found incorrect references buried all over the place, some of which would have been extremely hard to spot. Here's my actual exchange with ChatGPT:

- - - - - - - - - -

> Can you check the reference in cell F3? It looks like it's calling back to the wrong cell on the inputs tab. Are there similarly incorrect references elsewhere?

> Yes, F3 is incorrect, and there are multiple other incorrect references elsewhere: (It listed about 30 bulleted incorrect references)

Bottom line:
- Many formulas point to the wrong Inputs row because of the blank lines
- The Sell + Condo section also has a structural design problem, not just bad references.

The cleanest fix is for me to regenerate the entire AnnualModel TSV with:
- all references corrected
- all 15 years included
- the condo scenario modeled properly with a separate housing asset column

- - - - - - - - - -

This was me asking about the exact output I had just received (not something I had made any changes to or reworked.)

There are plenty of domains where I have enough faith and error tolerance to use ChatGPT all day, but this just sends a chill down my spine. How many users are really going to proof every single formula? And if I need to scrutinize to that level of detail, what's the point in the first place?

Acmeon 14 hours ago [-]

In principle, I find it valuable to integrate tools. However, in this case I would be somewhat cautious, especially as "your chats, attachments, and workbook content — may be shared with OpenAI" (as per the Microsoft Marketplace description: https://marketplace.microsoft.com/en-us/product/WA200010215?...).

This seems like a security nightmare, which is especially relevant because sensitive data is often stored in Excel files.

angadsg 14 hours ago [-]

Hi, engineer on this add-in. Fair concern, but we never train on any of our business or enterprise user data, or on your data if you have opted out of training on your ChatGPT account.

Avicebron 13 hours ago [-]

Forgive my ignorance. How do you folks manage context retention? Say if someone had a sensitive excel document they wanted inference done over, how is that data actually sent to the model and then stored or deleted?

It seems one of the biggest barriers to people's adoption is concern over data leaving their ecosystem and then not being protected or being retained in some way.

Is this an SLA that a small or medium-sized company could get?

p_ing 13 hours ago [-]

If you're concerned, you don't send it outside of the M365 boundary and presumably your admin has Purview Sensitivity Labels in place covering the document to prevent such activity.

Avicebron 13 hours ago [-]

Doesn't that mean you can't actually use it for those sensitive documents?

p_ing 11 hours ago [-]

Correct.

Avicebron 11 hours ago [-]

{EDIT} English and/or the concept of the written word may be foreign to you. Thank you for your assistance.

p_ing 44 minutes ago [-]

Not sure why you'd state that. 'Correct' is a grammatically correct and complete sentence to your question.

Acmeon 13 hours ago [-]

Yeah, I was expecting that you do not train on business or enterprise user data. However, I am not just worried about "training", but also about "sharing". Furthermore, I am worried about cases where an individual has chosen to integrate an add-in and then inadvertently leaks sensitive data.

However, it may be important to note that these security considerations are relevant for most Office Add-Ins (and not just the ChatGPT add-in).

p_ing 14 hours ago [-]

That's the nature of these add-ins. Modern Add-ins are all little XML frames with some JS or whatever. All processing occurs server-side, hosted by the add-in publisher.

This is counter to the old (security nightmare) COM model where processing could be local.

strongpigeon 14 hours ago [-]

To clarify: add-ins are essentially web pages. They can do some processing client side if they want, but yeah in the case of a ChatGPT add-in it's not like they're running the model in a web frame.

tills13 9 hours ago [-]

These AI in Excel products are a financial crisis waiting to happen. Or maybe just Enron but stupider.

thih9 7 hours ago [-]

> Follow along so you can trust the work

> (…) you can verify each step and revert edits if needed.

I wish there were different workflows.

It feels like the current most popular way of working with GenAI requires the operator to perform significant QA. The net time savings are usually positive. But it still feels inefficient, risky and frustrating, especially with more complex and/or niche problem areas.

Are there GenAI products that focus more on skill enhancement than replacement? Or any other workflows that improve reliability?

linzhangrun 7 hours ago [-]

This should have been implemented when Microsoft launched Copilot two years ago. Instead, they’d rather hijack the right Ctrl on every computer than do this.

gauravsc 5 hours ago [-]

I built this, https://novasheets.com/, based on my experience building agentic enterprise automation for the financial industry, and it works as well as ChatGPT, perhaps better :)

p_ing 14 hours ago [-]

Microsoft has this built-in using Claude models (for M365 Copilot licensed users). I don't know why you'd use this as an M365 subscriber in an enterprise. I'm sure there's some edge cases, but MSFT has been moving away from OAI. Even Copilot Studio agents now default to Sonnet 4.6 and not GPT 5.

strongpigeon 14 hours ago [-]

> I'm sure there's some edge cases, but MSFT has been moving away from OAI.

You're not wrong, but you'd think that given their 27% stake in OpenAI they'd put more weight behind ChatGPT integration.

ralph84 14 hours ago [-]

MSFT also has a stake in Anthropic (although much less than 27%) and they host Anthropic models in Foundry now. The end game for MSFT has always been being the compute provider, so MSFT is just as happy to use any model as long as it's running in Foundry.

p_ing 14 hours ago [-]

Based on my discussions with DSEs, enterprises have not been impressed with the results of "Copilot", i.e. OAI models. MSFT has been replacing them with Claude (or changing the default to Claude) across a variety of Copilot endpoints.

w2df 15 hours ago [-]

Copying Anthropic again lol.

Damn that OAI valuation is like a sore boil that is about to explode.

Also once again, a lack of imagination from OAI. Damn vision really is super scarce huh.

tokioyoyo 6 hours ago [-]

There's no real moat in feature set anymore. Within a given timeframe, any company should be able to copy some features from other companies. Thus the whole "distribution, marketing and sales are the only things that matter nowadays" joke.

Obviously doesn't apply to everything, and there are some features that are very hard to replicate. But still.

jimmydoe 14 hours ago [-]

saltman looks so desperate.

meanwhile not that ant is genius, except the timing of dow drama right before Iran war.

HerbManic 14 hours ago [-]

It was partially a joke, but someone posted an image of Copilot in Excel to demonstrate the limits of these things: three cells with three numbers (1, 2, 3), and Copilot asked to sum them up.

Instead of answering with 6, it came up with 15. The comment was "If AI is doing this, a global financial crash is inevitable."

Might not be real but it is something to keep an eye on. Hopefully, they are a bit more cautious on how this is implemented.

kgeist 14 hours ago [-]

I wonder why it's so bad. Do they just paste a CSV into the raw model? Because in my experience, even small local models can handle it reasonably well if the harness forces them to write & run a Python script that parses the table and performs the calculations, instead of relying solely on next-token prediction.
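To make the harness idea concrete, here is a minimal sketch of the kind of script such a harness might have the model write and run, instead of adding the numbers by next-token prediction. The CSV content and the function name `sum_column` are made up for illustration; this is not how any particular product works.

```python
import csv
import io
from decimal import Decimal


def sum_column(csv_text: str, column: str) -> Decimal:
    """Parse CSV text and return the exact sum of one named column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total = Decimal("0")
    for row in reader:
        cell = (row.get(column) or "").strip()
        if cell:  # skip blank cells rather than guessing a value
            total += Decimal(cell)
    return total


# Hypothetical three-cell sheet: one header row, then 1, 2, 3.
sheet = "value\n1\n2\n3\n"
print(sum_column(sheet, "value"))  # prints 6
```

The arithmetic is done by the interpreter, so the model only has to get the script right, and the script can be inspected or rerun. `Decimal` is used here so that values like 0.1 and 0.2 sum exactly.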

mritchie712 14 hours ago [-]

I remembered this post from (only) 3 years ago:

Show HN: I've built a C# IDE, Runtime, and AppStore inside Excel [0]

670 points | 179 comments

One of the main use cases was to analyze Excel data with SQL. I'm the kind of nerd that loves stuff like that, but stuff like that seems completely obsolete now.

[0] https://news.ycombinator.com/item?id=34516366

airstrike 14 hours ago [-]

This is quite cool, but it's only the tip of the iceberg.

Building an agent that can securely access systems of records, external data sources, and other files in your workspace—with context for the work you do outside of Excel—is where the revolution is at.

flexie 8 hours ago [-]

Does anyone here know if OpenAI plans on introducing a Word add-in, like Claude for Word?

1970-01-01 14 hours ago [-]

This is a drop-in database analysis tool and nobody knows it. Most Excel users are using Excel as a half-baked database instead of as a spreadsheet.

mentalgear 6 hours ago [-]

More like ChatGPT for Claude for Excel.

Instagraf 7 hours ago [-]

Nice addition for getting my head around those gnarly formulas ... and without having to jump out of the sheet.

mynameisneely 13 hours ago [-]

The interesting question isn't whether ChatGPT can do Excel. It's whether general-purpose AI beats role-specific AI for serious work. I'm building in marketing and the pattern I keep running into is that the blank canvas of ChatGPT is actually the problem for most people, not the solution. Analysts, marketers, ops folks don't want a chat interface. They want something that already knows the shape of their job. Horizontal tools win demos. Vertical tools win retention. My bet is the Excel crowd ends up somewhere closer to Rows or Equals than to a chat sidebar, but I could be wrong.

_doctor_love 14 hours ago [-]

I have been waiting for this moment. Whatever AI vendor establishes a strong beachhead in being competent at Excel is going to do extremely well.

Microsoft, being Microsoft, will find a way to win no matter who that vendor ends up being.

Bishonen88 9 hours ago [-]

FAQ: Is it available worldwide? A: Yes. (...) outside the EU.

So, yes but no. Not that I care, but the answer to the above question is a no, and should start with No.

orliesaurus 15 hours ago [-]

Next do one for PowerPoint and Outlook

keyle 14 hours ago [-]

Copilot is so bad that ChatGPT is offered to replace it.

    [for] ... users outside the EU.

hmm

p_ing 14 hours ago [-]

Your comment is recognized as low effort, but Copilot has been OAI models behind the scenes. For enterprise customers, they are quickly being replaced by Sonnet as the default.

keyle 14 hours ago [-]

Thank you for the high-effort response!

I would never use Copilot for anything useful, but I do use OpenAI products.

It doesn't matter if you use something else wholesale under the covers, if you botch the token spend...

p_ing 13 hours ago [-]

Token expenditure isn't a concern for Copilot users. They don't see that form of cost model, just a flat monthly (or yearly) price for a user license.

keyle 12 hours ago [-]

Exactly, and how do you think it's rigged in the setup? You're not getting top-tier OpenAI service with Copilot; that was my point.

p_ing 11 hours ago [-]

Microsoft runs the model, not OAI.

DeathArrow 9 hours ago [-]

It seems not to be available in the EU, possibly due to regulations.

whalesalad 9 hours ago [-]

Does a highly performant XLSX tool exist? I want to be able to open a 500k row, 60+ column table in Excel and manipulate it at 60+ FPS. Zero lag. I feel like Excel has never - ahem - excelled in this department. LibreOffice comes close and I enjoy it on Linux, but on my M2 MacBook Air it struggles.

rpearl 8 hours ago [-]

Try https://rowzero.com ? We have written a much faster spreadsheet engine and regularly work with 10M+ row datasets

TacticalCoder 12 hours ago [-]

Speaking of which... The corporate world has forever been producing PowerPoint presentations containing bogus numbers from buggy spreadsheets. (I was once tasked with porting a corporate spreadsheet to a dedicated internal app, and I then understood that decisions everywhere were being made based on bogus numbers from broken reports produced by spreadsheets full of broken numbers and assumptions.) It is now going full speed ahead: many vendors have added "Artificial 'Intelligence'" to their corporate tools and...

There are now just even more errors than there already were.

Now there's hope though: I take it that at some point, just as we already have AI that can find (and fix, sometimes even properly) errors in code, we may end up with AI tools able to find all the broken assumptions, errors, and wrong formulas that the spreadsheets running the corporate world are full of. But atm that's not where we are.

One such corporate-world company producing a gigantic turd would be the "biggest" (but it's really not that big) European software company, SAP... They're going all in on "business AI" as they see (rightly so?) AI as a terminal threat to their revenue model. Market cap went from $360 bn to $200 bn: don't know if it's related to their "genius" AI move.

And so now we have countless corporate drones who were already incapable of doing any kind of financial/accounting/math computation in a rigorous way who are now double-speeding on the errors, but this time AI-augmented.

It's the "let's add an AI chatbot to our site" (which so many companies are adding to their websites right now), but corporate version: "let's add AI to our corporate tools".

Just to be clear: I think this cannot fail. Failure and bogus numbers are the norm in spreadsheets, not the exception. More failure, more bogus computations, actually won't change a thing.

bewal416 14 hours ago [-]

Thanks, but wake me up when there's an actually good AI embedded directly in Google Sheets

_pdp_ 14 hours ago [-]

Why though? What is the point of this? I thought they are building towards an AGI.

haneul 15 hours ago [-]

Except for pro and plus users in the EU eh…

lgq 14 hours ago [-]

[dead]

sayYayToLife 15 hours ago [-]

[dead]

w2df 14 hours ago [-]

As someone who knows a high-flying portfolio manager at a very well-known firm that I won't name... I can confidently state these tools are DOA. I've spoken to them at length about the nature of what these people actually do day-to-day. If you think it's just about using Excel then you're already way off.

They (OAI+Anthropic) very much do not get exactly what these people are doing in the job (accounting+corporate finance+valuation+asset management) and what the actual production process is. These tools are irrelevant, disrupt flow and if anything just add noise to what one is doing.

airstrike 13 hours ago [-]

As a former investment banker, I mostly agree. This is probably 10% of the work.

esafak 14 hours ago [-]

Why are they irrelevant? You do not say anything.

airstrike 13 hours ago [-]

Because the challenge is in the space between apps, not in the apps themselves.

w2df 14 hours ago [-]

I care not to. I hope Anthropic and OAI keep burning money on stuff that's DOA.

I know there are employees of those firms here that would love to know. But nah lmao.

brcmthrowaway 14 hours ago [-]

I know the firm - it's RenTech.

w2df 14 hours ago [-]

nah the firm in question has much higher AUM.

brcmthrowaway 14 hours ago [-]

Citadel

z3c0 14 hours ago [-]

This might be the first time I've seen a HN comment in a GPT thread that actually reflects what the average business user sees in GPT products.

They don't do the job, reliably or well. No amount of wishful thinking or extra tokens will change that.

w2df 14 hours ago [-]

No surprise really.

Remember when Steve said 'The computers for the rest of us'?

I suppose it isn't a surprise. Are researchers/generally geeky people meant to be able to relate to the average person's day-to-day beyond their sphere? Lmao.

You can't produce stuff for people you don't understand. Understand being a very key term.
