This paper presents an optimal learning control method based on reinforcement learning for biological systems with redundant actuators. Applying reinforcement learning to biological control systems is difficult because of the redundancy in the muscle activation space. We address this problem as follows. First, we divide the control input space into two subspaces according to a learning priority order and restrict the search noise for reinforcement learning to the first-priority subspace. The constraint is then relaxed as learning progresses, so that the search space extends into the second-priority subspace. The higher-priority subspace is designed so that the impedance of the arm can be high. A smooth reaching motion is obtained through reinforcement learning without any prior knowledge of the arm's dynamics.
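
The prioritized search described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names, the linear release schedule, and the choice of a co-activation direction for an antagonist muscle pair as the first-priority subspace are all assumptions made for illustration. The exploration noise is split into its component inside the first-priority subspace, which is always kept, and the remainder, which is scaled by a release factor that grows from 0 to 1 as learning progresses.

```python
import random


def project(v, basis):
    """Project v onto the span of the given orthonormal basis vectors."""
    out = [0.0] * len(v)
    for b in basis:
        coeff = sum(vi * bi for vi, bi in zip(v, b))
        for i, bi in enumerate(b):
            out[i] += coeff * bi
    return out


def constrained_noise(raw, primary_basis, release):
    """Keep the first-priority component of raw in full and scale the
    remaining (second-priority) component by release in [0, 1]."""
    primary = project(raw, primary_basis)
    return [p + release * (r - p) for r, p in zip(raw, primary)]


def release_schedule(episode, n_episodes):
    """Hypothetical linear schedule: 0 at the first episode, 1 at the last."""
    return min(1.0, max(0.0, episode / max(1, n_episodes - 1)))


def sample_noise(dim, sigma, rng):
    """Draw isotropic Gaussian search noise in the full activation space."""
    return [rng.gauss(0.0, sigma) for _ in range(dim)]


if __name__ == "__main__":
    # Assumed example: two antagonist muscles, with the first-priority
    # subspace spanned by the co-activation direction (1, 1) / sqrt(2).
    s = 2 ** -0.5
    basis = [[s, s]]
    rng = random.Random(0)
    n_episodes = 5
    for episode in range(n_episodes):
        raw = sample_noise(2, 0.1, rng)
        release = release_schedule(episode, n_episodes)
        print(episode, release, constrained_noise(raw, basis, release))
```

With a release factor of 0 the noise lies entirely in the first-priority subspace, and with a release factor of 1 it equals the unconstrained noise. Any monotone schedule could replace the linear one assumed here.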