Q-Learning maintains a Q-table that the agent uses to find the best action to take in a given state. However, building and updating a Q-table becomes impractical in environments with large state spaces. In this post, we are going to create a Deep Q Neural Network to improve on Q-Learning. Instead of using a Q-table, we’ll implement a neural network that takes a state and approximates the Q-value of each action in that state. ref
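To make the idea concrete, here is a minimal sketch of a network that maps a state to one Q-value per action and then picks the greedy action. It is only an untrained forward pass in plain Python. The layer sizes, the helper names (`init_layer`, `q_values`, `greedy_action`) and the example state are all assumptions made for illustration, not the implementation from the post.

```python
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def linear(layer, x):
    """Apply a dense layer: weights @ x + biases."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def q_values(params, state):
    """Approximate Q(state, a) for every action a with a two-layer network."""
    hidden = [max(0.0, h) for h in linear(params[0], state)]  # ReLU
    return linear(params[1], hidden)

def greedy_action(params, state):
    """Pick the action with the highest approximated Q-value."""
    q = q_values(params, state)
    return max(range(len(q)), key=q.__getitem__)

# Hypothetical sizes: a 4-dimensional state and 2 discrete actions.
STATE_DIM, HIDDEN, N_ACTIONS = 4, 8, 2
rng = random.Random(0)
params = [init_layer(STATE_DIM, HIDDEN, rng), init_layer(HIDDEN, N_ACTIONS, rng)]

state = [0.1, -0.2, 0.05, 0.0]
q = q_values(params, state)
action = greedy_action(params, state)
```

The network replaces the table lookup: instead of reading a row of the Q-table, the agent feeds the state through the network and gets one value per action back. Training the weights is left out of this sketch.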
WebSite Making
We are going to write a Django project so that we can learn the basic concepts and operations. Our website is a basic poll application, which consists of two parts:
- A public site that lets people view polls and vote in them
- An admin site that lets you add, change, and delete polls.
RL-Reinforcement Learning
Reinforcement learning is useful when you have no training data or not enough specific expertise about the problem. At a high level, you know WHAT you want, but not really HOW to get there. Luckily, all you need is a reward mechanism, and the reinforcement learning model will figure out how to maximize the reward, if you just let it “play” long enough. This is analogous to teaching a dog to sit down using treats. At first the dog is clueless and tries random things on your command. At some point, it accidentally lands on its butt and gets a sudden reward. As time goes by, and given enough iterations, it’ll figure out the expert strategy of sitting down on cue.
RL-Double Q-Learning
Double Q-Learning in Reinforcement Learning.
DP-Object Detection
Object detection is one of the popular computer vision tasks, alongside image classification, object tracking, image segmentation, image captioning and image generation. The main goal of object detection is to find all the objects in an image, together with their positions and corresponding confidence scores.
In brief, in order to detect objects, we first need to generate region proposals, and then classify the object in each proposal and predict its bounding box.
DP-RNN
Recurrent Neural Networks are a type of artificial neural network designed to recognize patterns in sequences of data, such as text, genomes, handwriting, the spoken word, or numerical time series data emanating from sensors, stock markets and government agencies. These algorithms take time and sequence into account; they have a temporal dimension.
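The temporal dimension can be illustrated with a minimal sketch of a vanilla recurrent step, where the hidden state is updated as h_t = tanh(W_x x_t + W_h h_{t-1} + b). The weights below are hand-picked toy values and the function names (`rnn_step`, `run_rnn`) are assumptions for illustration only.

```python
import math

def rnn_step(x, h_prev, W_x, W_h, b):
    """One recurrent step: h_t = tanh(W_x @ x_t + W_h @ h_{t-1} + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def run_rnn(sequence, W_x, W_h, b):
    """Feed a sequence through the cell, carrying the hidden state forward."""
    h = [0.0] * len(b)  # initial hidden state
    for x in sequence:
        h = rnn_step(x, h, W_x, W_h, b)
    return h

# Toy weights (assumed): 1 input feature, 2 hidden units.
W_x = [[0.5], [-0.3]]
W_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

# The same inputs in a different order give a different final state.
forward = run_rnn([[1.0], [0.0]], W_x, W_h, b)
backward = run_rnn([[0.0], [1.0]], W_x, W_h, b)
```

Because the hidden state from the previous step feeds into the next one, the final state depends on the order of the inputs, not only on which inputs were seen.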
GAN Metrics
Several metrics for evaluating GANs, including the Inception Score.
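As a rough sketch of how the Inception Score is computed: given the class probabilities p(y|x) that a classifier assigns to each generated image, the score is exp of the average KL divergence between p(y|x) and the marginal p(y). In practice the probabilities come from a pretrained Inception network. Here they are hand-written toy values, and the function name `inception_score` is an assumption for illustration.

```python
import math

def inception_score(probs):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from per-sample class probabilities."""
    n = len(probs)
    n_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all samples.
    marginal = [sum(p[c] for p in probs) / n for c in range(n_classes)]
    kl_sum = 0.0
    for p in probs:
        # Terms with p(y|x) = 0 contribute nothing to the KL divergence.
        kl_sum += sum(pc * math.log(pc / marginal[c])
                      for c, pc in enumerate(p) if pc > 0)
    return math.exp(kl_sum / n)

# Toy predictions (assumed): two samples, two classes.
confident_diverse = [[1.0, 0.0], [0.0, 1.0]]  # sharp and varied predictions
uninformative = [[0.5, 0.5], [0.5, 0.5]]      # every sample looks the same

score_good = inception_score(confident_diverse)
score_bad = inception_score(uninformative)
```

The score is higher when each sample gets a confident prediction and the predictions are spread across classes. It reaches its minimum of 1 when every sample yields the same distribution.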
BroadReading
Here is a list of interesting things I picked up in my daily study.
Daily Paper Reading
Some interesting papers that I read or am about to read.