“Would catching your AIs trying to escape convince AI developers to slow down or undeploy? ” by Buck
MP3•Home de episódios
Manage episode 436387528 series 3364758
Conteúdo fornecido por LessWrong. Todo o conteúdo do podcast, incluindo episódios, gráficos e descrições de podcast, é carregado e fornecido diretamente por LessWrong ou por seu parceiro de plataforma de podcast. Se você acredita que alguém está usando seu trabalho protegido por direitos autorais sem sua permissão, siga o processo descrito aqui https://pt.player.fm/legal.
Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.I often talk to people who think that if frontier models were egregiously misaligned and powerful enough to pose an existential threat, you could get AI developers to slow down or undeploy models by producing evidence of their misalignment. I'm not so sure. As an extreme thought experiment, I’ll argue this could be hard even if you caught your AI red-handed trying to escape.
Imagine you're running an AI lab at the point where your AIs are able to automate almost all intellectual labor; the AIs are now mostly being deployed internally to do AI R&D. (If you want a concrete picture here, I'm imagining that there are 10 million parallel instances, running at 10x human speed, working 24/7. See e.g. similar calculations here). And suppose (as I think is 35% likely) that these models are egregiously [...]
---
First published:
August 26th, 2024
Source:
https://www.lesswrong.com/posts/YTZAmJKydD5hdRSeG/would-catching-your-ais-trying-to-escape-convince-ai
---
Narrated by TYPE III AUDIO.
…
continue reading
Imagine you're running an AI lab at the point where your AIs are able to automate almost all intellectual labor; the AIs are now mostly being deployed internally to do AI R&D. (If you want a concrete picture here, I'm imagining that there are 10 million parallel instances, running at 10x human speed, working 24/7. See e.g. similar calculations here). And suppose (as I think is 35% likely) that these models are egregiously [...]
---
First published:
August 26th, 2024
Source:
https://www.lesswrong.com/posts/YTZAmJKydD5hdRSeG/would-catching-your-ais-trying-to-escape-convince-ai
---
Narrated by TYPE III AUDIO.
365 episódios