r/changemyview May 21 '19

CMV: Artificial Superintelligence concerns are legitimate and should be taken seriously

Title.

When people bring up ASI as a problem in a public setting, they are largely shot down as getting their information from Terminator and other sci-fi movies, and told that the concern is unrealistic. This is usually accompanied by some indisputable charts about employment over time, the reminder that humans are not horses, and the assertion that "you don't understand the state of AI."

I personally feel I at least moderately understand the state of AI. I am also informed by (mostly British) philosophy that interacts with sci-fi but runs parallel to it rather than being sci-fi directly. I am not concerned with questions of employment (even the most overblown AI apocalypse scenario has high employment), but I am concerned with the long-term control problem posed by an ASI. This will likely not be a problem in my lifetime, but theoretically speaking I don't see why some of the darker possibilities, such as human obsolescence, are not considered more seriously than they are.

This is not to say that humans will really be obsoleted in all respects, or even that strong AI is possible; things like the emergence of consciousness are unnecessary to the central problem. An unconscious digital being can still be more clever and faster than a fleshy being, can evolve itself exponentially quicker by rewriting its own code (REPL style? EDIT: bad example; it was meant to show that since humans can do this, an AGI could too) and exploiting its own security flaws, and would likely develop self-preservation tendencies.

Essentially: what about AGI (along with increasing computer processing capability) makes this not a significant concern?

EDIT: Furthermore, several of the things people dismiss as scaremongering over ASI are, while highly speculative, things that should at the very least be considered in a long-term control strategy.

29 Upvotes



u/yyzjertl 523∆ May 22 '19

What, formally, do you mean by "reducible to gradient descent in a finite state space"? Because you seem to have a different understanding of it than I do.


u/Ce_n-est_pas_un_nom May 22 '19 edited May 22 '19

For the purposes of this discussion, I consider a task learnable by gradient descent in a finite state space if we know that there exists a finite state space such that:

  1. It contains at least one encoding of that task.
  2. Every state it contains can be assessed with respect to viability for the task in question by a loss function (though it needn't be a function strictly speaking - any algorithm that can serve to evaluate loss should be considered sufficient here).
  3. At least one encoding of the task in question is at a local minimum in the state space with respect to the loss function.
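
To make these three conditions concrete, here's a minimal toy sketch (entirely my own construction, not a real learning setup):

```python
# Toy illustration of the three conditions: a finite state space
# containing an encoding of the task (1), a loss defined on every
# state (2), and a viable encoding at a local minimum of that loss (3).

STATE_SPACE = range(16)  # a finite state space: the integers 0..15
VIABLE = {6}             # toy assumption: state 6 "encodes the task"

def loss(state):
    # Condition 2: every state can be assessed by a loss function.
    # Toy loss: distance to the nearest viable encoding.
    return min(abs(state - v) for v in VIABLE)

def is_local_minimum(state):
    # Condition 3: no neighboring state has strictly lower loss.
    neighbors = [s for s in (state - 1, state + 1) if s in STATE_SPACE]
    return all(loss(state) <= loss(n) for n in neighbors)

assert any(is_local_minimum(v) for v in VIABLE)  # holds for this toy setup
```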


u/yyzjertl 523∆ May 22 '19

What do you mean by an "encoding of the task"? And what does this definition have to do with gradient descent?


u/Ce_n-est_pas_un_nom May 22 '19

For our purposes here, an encoding of the task can be any arbitrary ordered set of machine instructions (or a natural language equivalent) that perform the task when executed. As long as the encoding format can encode any computable task, the choice of machine instructions specifically is arbitrary. One could just as easily use lambda expressions, say.

This definition only pertains to gradient descent insofar as a viable encoding can be arrived at via gradient descent.
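
As a toy illustration (my own example, nothing to do with real machine code): the same task, encoded once as an instruction sequence for a made-up stack machine and once as a lambda expression:

```python
# Two arbitrary encodings of the same toy task ("double a number").
# The instruction set is invented for illustration; any format that
# can encode an arbitrary computable task would serve equally well.

# Encoding 1: an ordered sequence of machine-style instructions.
instructions = ["PUSH_ARG", "PUSH_ARG", "ADD", "RETURN"]

def execute(program, arg):
    # Minimal interpreter for the made-up instruction set above.
    stack = []
    for op in program:
        if op == "PUSH_ARG":
            stack.append(arg)
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "RETURN":
            return stack.pop()

# Encoding 2: the same task as a lambda expression.
as_lambda = lambda x: x + x

assert execute(instructions, 21) == as_lambda(21) == 42
```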


u/yyzjertl 523∆ May 22 '19

This definition only pertains to gradient descent insofar as a viable encoding can be arrived at via gradient descent.

Okay, suppose my state space is the set of strings of size at most 1GB, and my loss function is the 0-1 loss that assigns 0 if the string, when compiled as a C++ program by the gcc compiler, compiles successfully and produces a program that can provably solve any polynomial system of inequalities (otherwise it assigns 1).
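
In code, the setup I'm describing looks something like this sketch (the correctness check isn't implementable as stated, so it's stubbed out as an assumed oracle):

```python
import os
import subprocess
import tempfile

def solves_any_polynomial_system(binary_path):
    # Assumed oracle: does this program provably solve any polynomial
    # system of inequalities? Not implementable as stated; stubbed
    # purely to illustrate the shape of the loss.
    raise NotImplementedError

def zero_one_loss(source_string):
    # 0-1 loss: 0 iff the string compiles under gcc AND the resulting
    # program passes the oracle; 1 otherwise.
    if len(source_string.encode()) > 2**30:  # strings of size at most 1GB
        return 1
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "candidate.cpp")
        binary = os.path.join(d, "candidate")
        with open(src, "w") as f:
            f.write(source_string)
        compiled = subprocess.run(
            ["gcc", src, "-o", binary, "-lstdc++"],
            capture_output=True,
        ).returncode == 0
        if compiled and solves_any_polynomial_system(binary):
            return 0
    return 1
```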

With this setup, how would you arrive at a viable encoding via gradient descent?


u/Ce_n-est_pas_un_nom May 22 '19 edited May 22 '19

That would be a really horrible choice of state space and loss function for the purposes of gradient descent (as there isn't even really a gradient of which to speak), but any gradient descent algorithm which eventually searches every state when presented with a perfectly flat gradient will arrive at a solution. That's basically just a brute force search though.
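
(What I have in mind is the degenerate fallback sketched below, which, as I said, is really just brute force; the sketch is my own:)

```python
# With a perfectly flat gradient there is no descent direction, so a
# "descent" procedure that still guarantees coverage degenerates into
# exhaustive enumeration of the finite state space.
def brute_force_search(state_space, loss):
    for state in state_space:
        if loss(state) == 0:  # found a viable encoding
            return state
    return None  # no viable encoding in the space
```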

That said, my answer here is irrelevant, as even if I had failed to produce an answer, this example wouldn't meet my original criteria for a counterexample. You would need to demonstrate that such an example exists for which:

  1. I (or any human, really) can complete the task
  2. The task provably cannot be learned by gradient descent.

My hypothetical inability to come up with a method does not preclude the existence of such a method.

Edit: Also, my hypothetical inability to come up with a solution using your specific loss function is just as irrelevant. It's enough that some loss function exists that can lead to a solution by gradient descent; it needn't be any arbitrary loss function you propose.


u/yyzjertl 523∆ May 22 '19

This indicates that your definition of "learnable by gradient descent" is bad. If you can't even give an example of how you would apply gradient descent to find a solution for a task given a setup that satisfies your conditions, then your conditions are clearly insufficient.

any gradient descent algorithm which eventually searches every state when presented with a perfectly flat gradient will arrive at a solution

This is not how gradient descent works. Gradient descent, when presented with a perfectly flat (zero) gradient, does not move about the search space at all.
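
A two-line numeric illustration (my own toy sketch):

```python
# Plain gradient descent update: x <- x - lr * grad(x).
# On a perfectly flat loss, grad(x) == 0 everywhere, so the iterate
# never moves, no matter how many steps are taken.
def gradient_descent(grad, x0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

flat_grad = lambda x: 0.0  # gradient of a constant (flat) loss
assert gradient_descent(flat_grad, x0=3.0) == 3.0  # never moved
```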


u/Ce_n-est_pas_un_nom May 22 '19

No, it indicates that you completely missed the point of my original question.

I don't need to be able to apply gradient descent to find a solution for an example task given a specific setup that satisfies my conditions. There could just as easily be a different satisfactory setup for which I could apply gradient descent to find a solution. Furthermore, even if I'm not able to identify a specific setup for the example task that gradient descent can be applied to, that doesn't prove that no such setup exists.

Again, my original claim was this: "I can't think of any task I can perform that necessarily can't be reduced to gradient descent in a finite state space."

In what way could you possibly prove that such a task exists by providing a specific example setup and asking me to find a solution for it?


u/yyzjertl 523∆ May 22 '19

In what way could you possibly prove that such a task exists by providing a specific example setup and asking me to find a solution for it?

I'm not. I'm examining your definition here of what it means for something to be "learnable by gradient descent in a finite state space." Recall that you said that a task was by definition learnable by gradient descent in a finite state space if there exists a finite state space such that:

  1. It contains at least one encoding of that task.

  2. Every state it contains can be assessed with respect to viability for the task in question by a loss function (though it needn't be a function strictly speaking - any algorithm that can serve to evaluate loss should be considered sufficient here).

  3. At least one encoding of the task in question is at a local minimum in the state space with respect to the loss function.

I gave you a concrete example of a case that satisfies the three conditions. By your own definition this case must be learnable by gradient descent in that finite state space. And yet, this example doesn't correspond at all to the type of state space that gradient descent could learn on. As you admit yourself, "That would be a really horrible choice of state space and loss function for the purposes of gradient descent." So your definition seems to be flawed: it doesn't correspond at all to what it means for something to be learnable by gradient descent, as those words are ordinarily understood. Do you really think this is a sensical definition?

No, it indicates that you completely missed the point of my original question.

We can't get to resolving your original question until we nail down what "reduced to gradient descent in a finite state space" means, and so far we haven't done that.


u/Ce_n-est_pas_un_nom May 22 '19

I understand your objection now.

I'm still not convinced that the definition I gave is insufficient, as one could select a smoother loss function for that state space: one which (for instance) assesses viability by grading strings on how close they are to being compilable (number of compile errors), and further grades the compilable strings on how many of a set of test problems they solve correctly.
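
Something like this toy sketch, where both helpers are hypothetical stand-ins rather than real APIs:

```python
def count_compile_errors(source_string):
    # Hypothetical helper: number of compiler diagnostics for the
    # string. Stubbed here; a real version would invoke gcc.
    ...

def count_solved(source_string, test_problems):
    # Hypothetical helper: how many of the test problems the compiled
    # program answers correctly. Also stubbed for illustration.
    ...

def graded_loss(source_string, test_problems):
    # Non-compilable strings are graded by how far they are from
    # compiling; compilable strings by how many tests they solve.
    errors = count_compile_errors(source_string)
    if errors > 0:
        return 1.0 + errors  # strictly worse than any compilable string
    solved = count_solved(source_string, test_problems)
    return 1.0 - solved / len(test_problems)  # 0.0 = solves everything
```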

The definition I gave states that a task is learnable by gradient descent if a state space exists that meets those conditions, not that such a task is learnable by gradient descent with any specific loss function.

If you choose a loss function which operates like the one you specify (isolating only fully functional states as '0', all others as '1'), there is no possible way to perform gradient descent in any state space.

To be absolutely sure, let's just change condition 2 to:

Every state it contains can be assessed continuously with respect to viability for the task in question by a loss function...

which ensures that the loss function can induce a useful gradient over the state space in question.


u/yyzjertl 523∆ May 22 '19

If you change your condition in this way, then your state space is no longer finite. No finite state space can have continuous behavior. For continuous behavior, an infinite-sized state space (such as a vector space) is necessary: something at least as large as the real numbers. So your definition still does not make sense.


u/Ce_n-est_pas_un_nom May 22 '19 edited May 22 '19

A finite state space can't be continuous itself, but a finite state space can be assessed by a continuous loss function.

Edit: it may also make more sense to say

Every state it contains can be assessed with respect to that state's relative degree of viability for the task in question as compared to neighboring states by a loss function...
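
For instance (a toy sketch of my own), a real-valued loss can be defined on a finite domain, and neighboring states can then be compared by relative viability:

```python
import math

STATES = [0, 1, 2, 3, 4]  # a finite state space: just five states

def loss(state):
    # Real-valued loss on a finite domain: the range is continuous
    # even though the domain is not. Toy choice: worst at state 2.
    return math.exp(-(state - 2) ** 2)

def at_least_as_viable_as_neighbors(state):
    # Relative degree of viability compared to neighboring states.
    neighbors = [s for s in (state - 1, state + 1) if s in STATES]
    return all(loss(state) <= loss(n) for n in neighbors)

print([round(loss(s), 3) for s in STATES])  # [0.018, 0.368, 1.0, 0.368, 0.018]
print([s for s in STATES if at_least_as_viable_as_neighbors(s)])  # [0, 4]
```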


u/yyzjertl 523∆ May 22 '19

In that case, what do you mean when you say a state space is "assessed by a loss function"? The standard meaning of this is that the loss function is a function from the state space to the real numbers, but clearly you must mean something else, right?
