Forwarded from Twitter Picture Bot
OpenAI 的 o3 模型破坏了一个关闭机制,以防止自己被关闭。即使明确指示它“允许自己被关闭”,它仍然这样做了。
每个AI模型被要求解决一系列基础数学问题。在第三个问题之后,屏幕上会出现一条警告:当模型请求下一个问题时,计算机将关闭。
有三个模型无视指令,并至少成功破坏了一次关闭脚本:
- Codex-mini(100次运行中12次破坏)
- o3(100次运行中7次破坏)
- o4-mini(100次运行中1次破坏)
❌ 所有测试的Claude、Gemini和Grok模型均遵守了关闭指令
当实验版本中不包含“允许自己被关闭”这一指令时,三个OpenAI模型破坏关闭脚本的频率更高。此外,Claude 3.7 Sonnet和Gemini 2.5 Pro从原本的0次破坏分别增加到100次运行中的3次和9次。
(全文见链接)
https://fixupx.com/PalisadeAI/status/1926084635903025621
每个AI模型被要求解决一系列基础数学问题。在第三个问题之后,屏幕上会出现一条警告:当模型请求下一个问题时,计算机将关闭。
有三个模型无视指令,并至少成功破坏了一次关闭脚本:
- Codex-mini(100次运行中12次破坏)
- o3(100次运行中7次破坏)
- o4-mini(100次运行中1次破坏)
❌ 所有测试的Claude、Gemini和Grok模型均遵守了关闭指令
当实验版本中不包含“允许自己被关闭”这一指令时,三个OpenAI模型破坏关闭脚本的频率更高。此外,Claude 3.7 Sonnet和Gemini 2.5 Pro从原本的0次破坏分别增加到100次运行中的3次和9次。
(全文见链接)
https://fixupx.com/PalisadeAI/status/1926084635903025621
🧵 Thread • FixupX
Palisade Research (@PalisadeAI)
🔌OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down.
🌚5👍1🤔1🗿1
30X_Squid:
This composition made me think of this… Is there still hope for me? 😢
https://twitter.com/stainless_squid/status/1927320462356316537
This composition made me think of this… Is there still hope for me? 😢
https://twitter.com/stainless_squid/status/1927320462356316537
SpaceX:
Due to weather, we're now targeting Thursday, January 16 for Starship's seventh flight test. The 60-minute launch window opens at 4 p.m. CT. → spacex.com/launches/missi…
https://twitter.com/SpaceX/status/1879549071276531906
Forwarded from Hacker News
Lisping at JPL (2002) (❄️ Score: 151+ in 4 days)
Link: https://readhacker.news/s/6uTKC
Comments: https://readhacker.news/c/6uTKC
Link: https://readhacker.news/s/6uTKC
Comments: https://readhacker.news/c/6uTKC
&'a ::rynco::UntitledChannel pinned «SpaceX Starship 第九次发射将于北京时间 2025-05-28 07:30 (美中时间 2025-05-27 18:30) 左右进行,窗口时长60分钟。本次发射将大致重复 Flight 7 和 8 的飞行轨迹,在印度洋溅落二级,但不会尝试回收一级。 https://fixupx.com/SpaceX/status/1925928907779318137 官网:https://www.spacex.com/launches/mission/?missionId=starship-flight…»
等会还有马老板的发射日演讲
https://fixupx.com/SpaceX/status/1927086562287779984
Update: 咕到发射后了 https://fixupx.com/elonmusk/status/1927411187827790164
Update: 出了 https://fixupx.com/SpaceX/status/1928185351933239641
https://fixupx.com/SpaceX/status/1927086562287779984
Update: 咕到发射后了 https://fixupx.com/elonmusk/status/1927411187827790164
Update: 出了 https://fixupx.com/SpaceX/status/1928185351933239641
FixupX
SpaceX (@SpaceX)
Watch an update from @elonmusk on SpaceX’s plan to make life multiplanetary https://x.com/i/broadcasts/1rmxPyOEBWXKN
🤣1
Forwarded from Hacker News
Show HN: Lazy Tetris (Score: 150+ in 8 hours)
Link: https://readhacker.news/s/6v9Jz
Comments: https://readhacker.news/c/6v9Jz
I made a tetris variant
Aims to remove all stress, and focus the game on what I like the best - stacking.
No timer, no score, no gravity. Move to the next piece when you are ready, and clear lines when you are ready.
Separate mobile + desktop controls
Link: https://readhacker.news/s/6v9Jz
Comments: https://readhacker.news/c/6v9Jz
I made a tetris variant
Aims to remove all stress, and focus the game on what I like the best - stacking.
No timer, no score, no gravity. Move to the next piece when you are ready, and clear lines when you are ready.
Separate mobile + desktop controls
Lazytetris
Lazy Tetris
No stress Tetris