# [Tutor] separately updating parameters

Zsolt Turi zsoltturi at gmail.com
Thu May 31 16:43:53 CEST 2012

```
Dear Pythonists,

I'm using Python 2.7 on Windows 7.

Problem description:
I am currently working on a reinforcement learning paradigm in which I would
like to update the Qa value with
alfaG [if decision_input = 1 and feedback_input = 1] or with
alfaL [if decision_input = 1 and feedback_input = 0].

(1) I have two input lists, each with two values:

decision_input = [1,1] - each element can be 1, 2, 3, 4, 5 or 6
feedback_input = [1,0] - each element is either 1 or 0

(2) The update equation is the following:

for gain: Qa = Qa+(alfaG*(feedback_input-Qa)), so I would like to
use alfaG only if the i-th element of feedback_input is 1
for loss: Qa = Qa+(alfaL*(feedback_input-Qa)), so I would like to
use alfaL only if the i-th element of feedback_input is 0

Qa value is initialized to zero.

(3) alfaG and alfaL should be incremented independently after updating
the Qa value:

alfaG = 0.01 - initial value
alfaL = 0.01 - initial value

(4) The problematic code :(

decision_input = [1,1]
feedback_input = [1,0]
a = []
alfaG = 0.01
alfaL = 0.01
value = 0.04

for i in range(len(decision_input)):
    if decision_input[i] == 1 and feedback_input[i] == 1:
        while alfaG < value:
            Qa = 0
            for feedb in feedback_input:
                Qa = Qa+(alfaG*(feedb-Qa))
                a.append(Qa)
            if decision_input[i] == 1 and feedback_input[i] == 0:
                while alfaL < value:
                    for feedb in feedback_input:
                        Qa = Qa+(alfaL*(feedb-Qa))
                        a.append(Qa)
                    alfaL += 0.01
            alfaG += 0.01
print a

After this, I get the following output:
[0.01, 0.0099], [0.02, 0.0196], [0.03, 0.0291]

(5) I have no idea how to get the following output:

[0.01, 0.0099], [0.01, 0.0098], [0.01, 0.0097]  --> alfaG = 0.01, alfaL = 0.01, 0.02, 0.03
[0.02, 0.0198], [0.02, 0.0196], [0.02, 0.0194]  --> alfaG = 0.02, alfaL = 0.01, 0.02, 0.03
[0.03, 0.0297], [0.03, 0.0294], [0.03, 0.0291]  --> alfaG = 0.03, alfaL = 0.01, 0.02, 0.03

Since both alfaG and alfaL take 3 values, I expect 3x3 = 9 lists.

Does anyone have an idea how to modify the code?

Best regards,
Zsolt
```
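
One possible way to get the 3x3 grid described in (5) is sketched below. It is only an illustration under a few assumptions, not a definitive answer: the learning rate is chosen per trial from the feedback (alfaG when the feedback is 1, alfaL when it is 0), trials where decision_input is not 1 are simply skipped, and Qa is reset to zero for every (alfaG, alfaL) pair. The names `q_trace`, `rates` and `results` are made up for this example. The rates are built from integers rather than with repeated `+= 0.01`, since accumulated floating-point error can make a `while alfa < value` test behave unexpectedly.

```python
def q_trace(decision_input, feedback_input, alfaG, alfaL, q0=0.0):
    """Return the Qa value after each trial in which option 1 was chosen.

    alfaG is used when the feedback is 1 (gain), alfaL when it is 0 (loss).
    Skipping trials with decision != 1 is an assumption of this sketch.
    """
    Qa = q0
    trace = []
    for decision, feedb in zip(decision_input, feedback_input):
        if decision != 1:
            continue
        # Choose the learning rate from this trial's feedback.
        alfa = alfaG if feedb == 1 else alfaL
        Qa = Qa + (alfa * (feedb - Qa))
        trace.append(Qa)
    return trace


decision_input = [1, 1]
feedback_input = [1, 0]

# 0.01, 0.02, 0.03 built from integers to avoid accumulating float error.
rates = [k / 100.0 for k in range(1, 4)]

# results[g][l] holds the Qa trace for rates[g] as alfaG and rates[l] as alfaL.
results = []
for alfaG in rates:
    row = [q_trace(decision_input, feedback_input, alfaG, alfaL)
           for alfaL in rates]
    results.append(row)
    # Round only for display; the stored values keep full precision.
    print([[round(q, 4) for q in pair] for pair in row])
```

Each printed line corresponds to one alfaG value and contains three [Qa after trial 1, Qa after trial 2] pairs, one per alfaL value. The nested loops are what make the two rates vary independently: the outer loop fixes alfaG while the inner one walks through every alfaL.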