[FIXED] tf_agents dqn fails to initialize

Issue

Even though tf_agents' initialize() requires no arguments, this line

agent.initialize()

produces this error

TypeError: initialize() missing 1 required positional argument: 'self'

I've tried agent.initialize(agent) because it apparently wanted self passed in… obviously that didn't work XD

I suspect the problem might be that this line

print(type(agent))   

produces

<class 'abc.ABCMeta'>

But that might be normal…

##################################

My whole script below reproduces the problem:

###  for 9 by 9 connect 4 board 
#
import tensorflow as tf
from tf_agents.networks import q_network
from tf_agents.agents.dqn import dqn_agent
import tf_agents
import numpy as np

print(tf.__version__)
print(tf_agents.__version__)

import tensorflow.keras


observation_spec  = tf.TensorSpec(   #   observation tensor = the whole board , ideally 0's, 1's , 2's for empty, occupied by player 1 , occupied by player 2
    [9,9],
    dtype=tf.dtypes.float32,
    name=None
)

action_spec  = tf_agents.specs.BoundedArraySpec(    
    [1],                         ### tf_agents.networks.q_network only seems to take an action of size 1  
    dtype= type(1) ,     #tf.dtypes.float64,
    name=None, 
    minimum=0,
    maximum=2
)
#######################################

def make_tut_layer(size):
    return tf.keras.layers.Dense(
        units= size,
        activation= tf.keras.activations.relu,
        kernel_initializer=tf.keras.initializers.RandomNormal(mean=0., stddev=1.)
                                )

def make_q_layer(num_actions):
    q_values_layer = tf.keras.layers.Dense (        # last layer gives probability distribution over all actions so we can pick best action 
        num_actions ,
        activation = tf.keras.activations.relu , 
        kernel_initializer = tf.keras.initializers.RandomUniform( minval = 0.03 , maxval = 0.03),
        bias_initializer = tf.keras.initializers.Constant(-0.2)
                                        ) 
    return q_values_layer;



############################## stick together layers below

normal_layers = []

for i in range(3):
    normal_layers.append(make_tut_layer(81))
q_layer = make_q_layer(9)

q_net = tf.keras.Sequential(normal_layers + [q_layer])

######################################

agent = dqn_agent.DqnAgent
(
    observation_spec,    ### bonus question, why do i get syntax errors when i try to label variables like ---> time_step_spec = observation_spec, gives me SyntaxError: invalid syntax   on the = symbol       
    action_spec,
    q_net,
    tf.keras.optimizers.Adam(learning_rate= 0.001 )

)
eval1 = agent.policy
print(eval1)
eval2= agent.collect_policy
print(eval2)
print(type(agent))  
agent.initialize()
print(" done ")

and produces this output:

2.9.2
0.13.0
<property object at 0x000001A13268DA90>
<property object at 0x000001A13268DAE0>
<class 'abc.ABCMeta'>
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [53], in <cell line: 73>()
     71 print(eval2)
     72 print(type(agent))
---> 73 agent.initialize()
     74 print(" done ")

TypeError: initialize() missing 1 required positional argument: 'self'

Is my agent's type OK? Should it be <class 'abc.ABCMeta'>?

Why does my agent fail to initialize?

Solution

The answer is very simple: you can't move the opening ( of a function call onto the next line.

What you’re effectively doing:

make agent an alias for dqn_agent.DqnAgent (the class)

agent = dqn_agent.DqnAgent

evaluate an expression (here, a 4-tuple) and discard its result

(
    observation_spec,
    action_spec,
    q_net,
    tf.keras.optimizers.Adam(learning_rate= 0.001 )
)
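The same two-statement pattern can be reproduced with any plain class (a hypothetical Agent stand-in below, not the tf_agents API), and it reproduces every symptom from the question:

```python
import abc

class Agent(abc.ABC):
    """Minimal stand-in; tf_agents agents are likewise built on an ABCMeta-based class."""

    def initialize(self):
        return "initialized"

    @property
    def policy(self):
        return "a policy"

# The same mistake: the '(' on the next line is NOT part of a call.
agent = Agent      # binds the class itself to the name 'agent'
(
    "observation_spec",   # just a tuple expression, evaluated and discarded
)

print(type(agent))   # <class 'abc.ABCMeta'> -- the metaclass of the class
print(agent.policy)  # a raw property object, not "a policy"

try:
    agent.initialize()   # unbound call on the class: no instance to supply 'self'
except TypeError as err:
    error_message = str(err)
print(error_message)     # the "missing 1 required positional argument: 'self'" error
```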

That also answers the bonus question: since it's not a function call, there are no keyword parameters, and an assignment is not allowed inside an expression, which is exactly what Python's SyntaxError is telling you. It explains the other symptoms too: type(agent) prints <class 'abc.ABCMeta'> because agent is the DqnAgent class itself and ABCMeta is its metaclass; agent.policy and agent.collect_policy print as raw property objects because properties only resolve through an instance; and initialize() fails with the missing 'self' error because there is no instance for the method to bind to.
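This is easy to verify with compile(): the same name = value tokens are rejected inside a bare parenthesized expression but accepted as a keyword argument of an actual call (illustrative names only):

```python
# 'name = value' inside a parenthesized expression is a SyntaxError,
# because keyword-argument syntax only exists inside function calls.
try:
    compile("(time_step_spec = observation_spec,)", "<demo>", "eval")
except SyntaxError as exc:
    print("rejected:", exc.msg)

# The identical tokens compile fine as a keyword argument of a real call.
compile("f(time_step_spec = observation_spec)", "<demo>", "eval")
print("call form compiles")
```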

Put the opening bracket right after dqn_agent.DqnAgent, and it should work:

agent = dqn_agent.DqnAgent(
    observation_spec,
    action_spec,
    q_net,
    tf.keras.optimizers.Adam(learning_rate= 0.001 )
)
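With the bracket on the same line, a minimal stand-in class (hypothetical, not the tf_agents API) behaves as expected, which is a quick way to sanity-check the fix:

```python
class Agent:
    """Minimal stand-in class for illustration."""

    def __init__(self, spec):
        self.spec = spec

    def initialize(self):
        return "done"

agent = Agent("observation_spec")  # '(' on the same line: a real call
print(type(agent))                 # the Agent class itself, not a metaclass
print(agent.initialize())          # prints: done
```

After the fix, print(type(agent)) in the original script should report the DqnAgent class rather than abc.ABCMeta.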

Answered By – LogicDaemon

Answer Checked By – Timothy Miller (Easybugfix Admin)
