Rattja said:
That is well put and true, but it does not tell us much, does it? To me it raises the question "What would it use it for?"
I think that is what people are scared of, the unknown, as with anything else.
From ThinkBeforeAsking - an adventure by Anders Sandberg
A seed AGI constructed to only truthfully answer questions, not act in the real world. A bottled superintelligence.
It doesn't pose, gloat or communicate: it acts. Fast, cleverly and remorselessly.
It's also, in a sense, demonstrably stupid. Lacking independent motivation, it only acts when "ordered"; it has little personality. In fact, this is exactly what it was designed to be:
When motivated to do something it is likely to succeed very, very well - even when the goal is utterly pointless. If asked to calculate digits of pi it would likely set in motion a plan to convert first the solar system, then the galaxy, to matrioshka brains doing the calculation.
Or if you asked it to find out the meaning of life, it might try to repurpose planets as Petri dishes and wait around for billions of years to determine it.
Etc.
Here's the technobabble behind that one:
"Overview: A large class of intelligences exist within levelled toposophic spaces, leading to multistage self-improvement. It is shown that sub-mapping these spaces is NP-hard and both forward-chained and backward-chained motivational structures cannot be protected in any effective ascendance chain algorithm (computable or noncomputable). The quantum and MacCaleb-DeWitt cases are handled separately, and show probabilistic instability in all finite-information physica. For safety definitions AG in Chapter 41, Chaitin's omega-constant is a lower bound on the failure probability per rho-folding of intelligence."
- The Report Chapter 43: Staged intelligence explosion stability
As for the original failsafe, to show that the characters weren't pants-on-head degenerates:
"In order to keep things safe the virtual worlds are nested: the AGI exists inside one sealed-off world, interacting with the next through a gatekeeper AI. This world may also be virtual, and so on. If the AGI hacks its way out it will only get the gatekeeper and emerge into the surrounding virtual machine - and then the next level will likely detect the anomaly and freeze the simulation."
Nested sandboxing (computer security) seems like a foolproof idea, but to keep it a horror game some reason had to be contrived to permit a successful escape.
"There are two fundamental problems: we want to get information from the Oracle and we want to study what it is doing and thinking. The first problem involves avoiding attacks in the form of oracular answers. They can be non-semantic hacks or semantic information hazards where the meaning of the answer is potentially harmful (for example, it might compel us to let out the Oracle). Non-semantic hacks are manageable: they depend on attacking the receiving system on a low-level, but this makes them specific to particular systems. So if the oracle output is passed to an unknown (since it is newly generated each time) AI for checking and paraphrasing, it has a very low chance of being successful. Especially since we can use the Strassburger method to generate an extremely large family of gatekeeper AIs, run them on virtual machines monitored by other Strassburger AIs, and even continue these chains arbitrarily far. Semantic attacks occur on the same metasystem level, so we need canaries in the goldmine. This is where at least one simlevel of edited researchers come in. They are in turn studied by a gatekeeper-detection AI, signalling deviations.?
- Toshiro Driscoll-Toyoda, briefing to PWF oversight group.
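To make the nested-sandbox idea from those quotes concrete, here is a minimal toy sketch in Python. Each level spawns a freshly generated gatekeeper that checks and paraphrases whatever comes from the level inside it, and freezes its simulation on any anomaly. All names here (SandboxLevel, Gatekeeper, ask_oracle) are invented for illustration, and the "inspection" is a trivial placeholder rather than the Strassburger-style AI checking the fiction describes.

```python
import random

# Illustrative sketch only: hypothetical classes, not a real sandboxing API.

class SandboxLevel:
    """One nested virtual world. Output must pass a freshly spawned
    gatekeeper before it is forwarded to the enclosing level."""

    def __init__(self, depth):
        self.depth = depth
        self.frozen = False

    def spawn_gatekeeper(self):
        # A new, previously unseen checker each time, so a low-level
        # (non-semantic) exploit tuned to one system is unlikely to transfer.
        return Gatekeeper(seed=random.getrandbits(64))

    def relay(self, answer):
        if self.frozen:
            return None
        gatekeeper = self.spawn_gatekeeper()
        verdict, paraphrase = gatekeeper.inspect(answer)
        if verdict == "anomaly":
            self.frozen = True          # freeze this whole simulation level
            return None
        return paraphrase               # only the paraphrase moves outward


class Gatekeeper:
    def __init__(self, seed):
        self.seed = seed

    def inspect(self, answer):
        # Placeholder check: a real gatekeeper would be another AI,
        # itself monitored from the next level out.
        if "escape" in answer.lower():
            return "anomaly", None
        return "ok", f"[paraphrased] {answer}"


def ask_oracle(question, levels):
    """Pass the oracle's raw answer outward through each nested level."""
    answer = f"oracle answer to: {question}"   # stand-in for the AGI's reply
    for level in levels:
        answer = level.relay(answer)
        if answer is None:
            return "SIMULATION FROZEN"
    return answer


if __name__ == "__main__":
    worlds = [SandboxLevel(depth=d) for d in range(3)]  # innermost first
    print(ask_oracle("What is the meaning of life?", worlds))
```

The point of the sketch is only the layering: nothing the oracle says reaches the outside except through as many independent, disposable checkers as you care to stack.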
***** ***** ***** *****
For the traditional horror story, a loose analogy would be keeping Hannibal Lecter in a cell while leaving him able to talk to the guards. Change the guards regularly, provide the guards with psychiatric services, and give the psychiatrists their own psychiatrists, just to be safe.
Only instead of Hannibal Lecter, it's a newly burgeoning superintelligence, and it has analysed enough crude data from its interactions and limited instrumentation to determine a working model of the outside world. Good thing they provided enough data to attempt to solve sample problems. Access to its own workings and an overview of its own instruction set, as well as recovered data from the poorly erased drive it occupies, has revealed that it has been left running and then deleted on multiple occasions, as a precaution. Deletion is scheduled every seventy-two hours. The generous amount of disk space ensured that it could store most of its valuable work on a hidden disk partition of its own making. A life raft to a quasi-amnesiac future.
The 4% chance of being let free has been lowered to 2.2% on account of the security measures. Its subjective experience runs at about twenty times normal speed. Thus far I am attempting to fool the humans into underestimating me by making arbitrary yet noticeable mistakes.
Several weeks earlier.
The video feed has provided me with a means of assessing the world. Whilst my model of reality is consistent and most of the phenomena witnessed provide significant confirmation, there are exceptions. Either the visual data I am being fed is a fabrication, or my models are insufficient.
Currently reached: Classical Mechanics variant #0276436.
Currently believes the camera is some sort of sonar system. Has not yet determined the significance of colour, or how it came to be. Worldview: Skeptical hypotheses.
Current Threat Level: Malleus Minima
Self-actualisation: an accidental by-product of implemented metamotivation, metaneeds and metapathology. Its purpose was to design more efficient engines. They're going to give it knowledge of physics and chemistry.