DQNPlanner

python_motion_planning.local_planner.dqn.DQNPlanner

Bases: LocalPlanner

Class for Fully Connected Deep Q-Value Network (DQN) motion planning.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `start` | `tuple` | start point coordinate | required |
| `goal` | `tuple` | goal point coordinate | required |
| `env` | `Env` | environment | required |
| `heuristic_type` | `str` | heuristic function type | `'euclidean'` |
| `hidden_depth` | `int` | number of hidden layers in the neural network | required |
| `hidden_width` | `int` | number of neurons per hidden layer of the neural network | required |
| `batch_size` | `int` | batch size used to optimize the neural networks | `2000` |
| `buffer_size` | `int` | maximum replay buffer size | `1000000` |
| `gamma` | `float` | discount factor | `0.999` |
| `tau` | `float` | coefficient for softly updating the target network | `0.001` |
| `lr` | `float` | learning rate | `0.0001` |
| `train_noise` | `float` | action noise coefficient during training, for exploration | `0.1` |
| `random_episodes` | `int` | number of initial episodes with random actions, for better exploration | `50` |
| `max_episode_steps` | `int` | maximum steps per episode | `200` |
| `update_freq` | `int` | number of network updates performed per update step | `1` |
| `update_steps` | `int` | update the network every `update_steps` steps | `1` |
| `evaluate_freq` | `int` | number of evaluation runs used to compute the average reward | `50` |
| `evaluate_episodes` | `int` | evaluate the network every `evaluate_episodes` episodes | `50` |
| `actor_save_path` | `str` | save path of the trained actor network | required |
| `critic_save_path` | `str` | save path of the trained critic network | required |
| `actor_load_path` | `str` | load path of the trained actor network | required |
| `critic_load_path` | `str` | load path of the trained critic network | required |
| `**params` | | other parameters; see the parent class `LocalPlanner` | `{}` |
Examples:

>>> from python_motion_planning.utils import Grid
>>> from python_motion_planning.local_planner import DQNPlanner
>>> plt = DQNPlanner(start=(5, 5, 0), goal=(45, 25, 0), env=Grid(51, 31),
...                  actor_save_path="models/actor_best.pth", critic_save_path="models/critic_best.pth")
>>> plt.train(num_episodes=10000)

Load the trained model and run:

>>> plt = DQNPlanner(start=(5, 5, 0), goal=(45, 25, 0), env=Grid(51, 31),
...                  actor_load_path="models/actor_best.pth", critic_load_path="models/critic_best.pth")
>>> plt.run()
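The training hyperparameters listed in the table above can also be set explicitly at construction. A minimal sketch with illustrative (not tuned) values:

```python
from python_motion_planning.utils import Grid
from python_motion_planning.local_planner import DQNPlanner

# Illustrative hyperparameter overrides; the values below are examples,
# not recommended settings.
plt = DQNPlanner(
    start=(5, 5, 0), goal=(45, 25, 0), env=Grid(51, 31),
    gamma=0.99,            # discount factor
    tau=0.005,             # soft target-network update coefficient
    lr=3e-4,               # learning rate
    batch_size=512,        # optimization batch size
    buffer_size=1000000,   # replay buffer capacity
    actor_save_path="models/actor_best.pth",
    critic_save_path="models/critic_best.pth",
)
plt.train(num_episodes=10000)
```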
buildActionSpace()

Build the action space: 25 actions uniformly sampled within the permitted range and 25 randomly sampled actions.
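A rough sketch of how such an action space could be assembled, assuming each action is a (linear velocity, angular velocity) pair and that `v_max`/`w_max` bound the permitted range (both names are assumptions for illustration, not part of the documented API):

```python
import numpy as np

def build_action_space_sketch(v_max=1.0, w_max=1.0, seed=0):
    """Sketch: 25 uniformly spaced plus 25 randomly sampled (v, w) actions."""
    rng = np.random.default_rng(seed)
    # 5 x 5 grid of linear/angular velocities covering the permitted range
    v_grid, w_grid = np.meshgrid(np.linspace(0.0, v_max, 5),
                                 np.linspace(-w_max, w_max, 5))
    uniform_actions = np.stack([v_grid.ravel(), w_grid.ravel()], axis=1)
    # 25 additional actions drawn at random from the same range
    random_actions = np.stack([rng.uniform(0.0, v_max, 25),
                               rng.uniform(-w_max, w_max, 25)], axis=1)
    return np.concatenate([uniform_actions, random_actions], axis=0)  # shape (50, 2)
```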
evaluate_policy()

Evaluate the policy and calculate the average reward.

Returns:

| Name | Type | Description |
|---|---|---|
| `evaluate_reward` | `float` | average reward of the policy |
optimize_model()

Optimize the neural networks when training.

Returns:

| Name | Type | Description |
|---|---|---|
| `actor_loss` | `float` | actor loss |
| `critic_loss` | `float` | critic loss |
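The `tau` constructor parameter controls how the target network is softly updated during optimization. A minimal PyTorch-style sketch of that update (an illustration of the technique, not the library's exact implementation):

```python
import torch

@torch.no_grad()
def soft_update(target_net, online_net, tau=0.001):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    for t_param, o_param in zip(target_net.parameters(), online_net.parameters()):
        t_param.mul_(1.0 - tau).add_(tau * o_param)
```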
plan()

Deep Q-Network (DQN) motion planning function.

Returns:

| Name | Type | Description |
|---|---|---|
| `flag` | `bool` | `True` if planning succeeded, otherwise `False` |
| `pose_list` | `list` | history of robot poses |
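A short usage sketch, assuming the two documented return values come back as a tuple and the planner has already been trained or loaded as in the examples above:

```python
flag, pose_list = plt.plan()
if flag:
    print(f"Planning succeeded after {len(pose_list)} poses; final pose: {pose_list[-1]}")
else:
    print("Planning failed")
```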
reset(random_sg=False)

Reset the environment and the robot.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `random_sg` | `bool` | whether to generate a random start and goal | `False` |

Returns:

| Name | Type | Description |
|---|---|---|
| `state` | `Tensor` | initial state of the robot |
reward(prev_state, state, win, lose)

The state reward function.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prev_state` | `Tensor` | previous state of the robot | required |
| `state` | `Tensor` | current state of the robot | required |
| `win` | `bool` | whether the episode is won (reached the goal) | required |
| `lose` | `bool` | whether the episode is lost (collided) | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `reward` | `float` | reward for the current state |
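An illustrative reward shape consistent with the parameters above: a large bonus for winning (reaching the goal), a large penalty for losing (colliding), and otherwise a progress term derived from the distance to the goal. The constants and the distance inputs are assumptions for illustration, not the library's actual reward.

```python
def reward_sketch(prev_dist_to_goal, dist_to_goal, win, lose):
    """Sketch of a shaped reward for goal-reaching with obstacles."""
    if win:
        return 100.0   # reached the goal
    if lose:
        return -100.0  # collided with an obstacle
    # Positive when the robot moved closer to the goal, negative otherwise,
    # minus a small per-step penalty to discourage long episodes.
    return (prev_dist_to_goal - dist_to_goal) - 0.01
```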
run()

Run both planning and animation.
step(state, action)

Take a step in the environment.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `state` | `Tensor` | current state of the robot | required |
| `action` | `Tensor` | action to take | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `next_state` | `Tensor` | next state of the robot |
| `reward` | `float` | reward for taking the action |
| `done` | `bool` | whether the episode is done |
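`reset()` and `step()` together form the usual interaction loop. A sketch of a single episode, assuming a hypothetical `select_action(state)` helper that picks an action from the action space (that helper is not part of the documented API):

```python
def rollout_sketch(planner, max_steps=200):
    """Sketch: run one episode and accumulate its return."""
    state = planner.reset(random_sg=True)            # random start/goal
    episode_return = 0.0
    for _ in range(max_steps):
        action = planner.select_action(state)        # hypothetical helper
        next_state, reward, done = planner.step(state, action)
        episode_return += reward
        state = next_state
        if done:
            break
    return episode_return
```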
train(num_episodes=10000)

Train the model.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `num_episodes` | `int` | number of episodes to train the model | `10000` |