  1. About perspective normalization: the closer a person is to the camera, the larger they appear in the image, so their pixels should receive a smaller weight, which is consistent with the paper's description. In the paper, however, the closer a person is, the hotter the color, which would mean a larger weight for people close to the camera.

  2. Down-sampling the training images by $\frac{1}{4}$ before training: wouldn't that change the density?

    Use a delta function to model an image with $N$ heads?

CNN based

2D Gaussian Kernel

  1. The $\sigma$ determines the width of the Gaussian kernel.
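To make the delta-function and Gaussian-kernel notes above concrete, here is a minimal sketch that places one normalized 2D Gaussian at each head position, which is equivalent to convolving a sum of delta functions with the kernel. The function names, the fixed `sigma`, the `3 * sigma` kernel radius, and the head coordinates are illustrative assumptions, not values taken from any of the papers.

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """Normalized 2D Gaussian kernel of size (2*radius+1) x (2*radius+1)."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()  # sums to 1, so each head contributes a count of 1

def density_map(shape, heads, sigma=2.0):
    """Place one Gaussian per head position given as (row, col)."""
    radius = int(3 * sigma)  # assumed truncation radius
    k = gaussian_kernel(sigma, radius)
    padded = np.zeros((shape[0] + 2 * radius, shape[1] + 2 * radius))
    for r, c in heads:
        # In padded coordinates the head sits at (r + radius, c + radius).
        padded[r:r + 2 * radius + 1, c:c + 2 * radius + 1] += k
    # Cropping back to the image discards mass of heads close to the border.
    return padded[radius:-radius, radius:-radius]

heads = [(20, 20), (30, 45), (40, 10)]  # hypothetical head annotations
dmap = density_map((64, 64), heads, sigma=2.0)
total = dmap.sum()
```

Since every kernel sums to 1, `total` equals the number of heads as long as no kernel is cut off by the image border. A larger `sigma` spreads each head over a wider area without changing the count.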

[2008-AB Chan]

[chapter3.1 | 3.3 understanding ]

Feature Extraction



Segment Features


[perimeter finding] :

definition : total number of pixels on the segment perimeter, computed with morphological operators.
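A minimal sketch of counting perimeter pixels with a morphological operator, assuming the segment is given as a binary mask. The notes do not say which operators are used, so this assumes the perimeter is the set of foreground pixels removed by one 4-connected erosion; `perimeter_pixels` is a hypothetical helper name.

```python
import numpy as np

def perimeter_pixels(mask):
    """Foreground pixels that vanish under one 4-connected erosion."""
    mask = np.asarray(mask, dtype=bool)
    p = np.pad(mask, 1, constant_values=False)
    # A pixel survives erosion only if it and its 4 neighbours are foreground.
    eroded = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
              & p[1:-1, :-2] & p[1:-1, 2:])
    return mask & ~eroded

segment = np.zeros((7, 7), dtype=bool)
segment[1:6, 1:6] = True  # a 5x5 square segment
perimeter = perimeter_pixels(segment)
perimeter_length = int(perimeter.sum())
```

For the 5x5 square, the inner 3x3 block survives erosion, so the perimeter has 25 - 9 = 16 pixels.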

Perimeter Edge Orientation

definition : for each pixel on the segment edge, a set of oriented Gaussian filters is applied at that pixel, and the orientation of the filter with the maximum response is taken as the orientation of the pixel. The orientation histogram is the histogram of these per-pixel orientations, which are integers in the range [0…5].
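The definition above can be sketched as follows. The filter shape is an assumption: six anisotropic Gaussians elongated along the angles $k\pi/6$, with hypothetical parameters `radius`, `sigma_long` and `sigma_short`. The filters actually used in the paper may differ.

```python
import numpy as np

def oriented_gaussian_bank(n_orient=6, radius=4, sigma_long=3.0, sigma_short=1.0):
    """Anisotropic Gaussian kernels elongated along angles k*pi/n_orient."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    bank = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        u = xx * np.cos(theta) + yy * np.sin(theta)   # along the filter axis
        v = -xx * np.sin(theta) + yy * np.cos(theta)  # across the filter axis
        g = np.exp(-(u**2 / (2 * sigma_long**2) + v**2 / (2 * sigma_short**2)))
        bank.append(g / g.sum())
    return np.stack(bank)

def orientation_histogram(edge_mask, bank):
    """Histogram of the strongest-responding filter index over edge pixels."""
    edge = np.asarray(edge_mask, dtype=float)
    radius = bank.shape[1] // 2
    padded = np.pad(edge, radius)  # zero padding around the image
    hist = np.zeros(len(bank), dtype=int)
    for r, c in zip(*np.nonzero(edge)):
        patch = padded[r:r + 2 * radius + 1, c:c + 2 * radius + 1]
        responses = (bank * patch).sum(axis=(1, 2))  # one response per filter
        hist[np.argmax(responses)] += 1
    return hist

edges = np.zeros((21, 21), dtype=bool)
edges[10, :] = True  # a horizontal edge
bank = oriented_gaussian_bank()
hist = orientation_histogram(edges, bank)
```

For a purely horizontal edge, the filter at index 0 (elongated along the image columns) responds most strongly, so the histogram mass is concentrated in bin 0.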

Edge Features



Minkowski Dimension

Texture Features


  • some local pattern repeats over and over;
  • the arrangement is non-random;
  • the textured region is roughly a homogeneous, uniform whole;

GLCM For Texture Abstracting



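Suppose the set of grey levels is $\{0,1,2,3\}$; the co-occurrence matrix then has size $4\times 4$. For all $i,j\in\{0,1,2,3\}$, each matrix entry takes the value $f(i,j|d,\theta)$, where $d$ and $\theta$ are given. [Implementation] :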
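A minimal sketch of building the co-occurrence matrix for four grey levels. It assumes $f(i,j|d,\theta)$ is the number of pixel pairs in which a pixel with grey level $i$ has a pixel with grey level $j$ at offset $(d,\theta)$. The function name `glcm` and the sample image are illustrative, and this version counts ordered pairs only (it is not symmetrized).

```python
import math
import numpy as np

def glcm(image, d=1, theta=0.0, levels=4):
    """Count ordered grey-level pairs (i, j) separated by distance d at angle theta."""
    image = np.asarray(image)
    dr = int(round(-d * math.sin(theta)))  # row offset (image rows grow downward)
    dc = int(round(d * math.cos(theta)))   # column offset
    rows, cols = image.shape
    m = np.zeros((levels, levels), dtype=int)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r, c], image[r2, c2]] += 1
    return m

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
counts = glcm(img, d=1, theta=0.0)  # each pixel paired with its right neighbour
probs = counts / counts.sum()       # optional normalization to frequencies
```

With $d=1$ and $\theta=0$, each of the 4 rows contributes 3 horizontal pairs, so `counts` sums to 12. Adding `counts.T` would give the symmetric variant that treats $(i,j)$ and $(j,i)$ as the same pair.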





[2016-Yingying Zhang]





  1. Downsampling an image

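  2. FPS (Frames Per Second)

    FPS is a term from the imaging field: the number of frames transmitted per second, or, informally, the number of pictures shown per second in an animation or video. Film is played at 24 pictures per second, i.e. 24 still images are projected onto the screen in succession within one second, so we say that film runs at 24 fps.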

  3. A dynamic texture

    A dynamic texture (DT) is the temporal extension of a 2D texture. It is a spatio-temporal generative model for video that represents a video sequence as observations from a linear dynamical system.

Definition : For a frame at time $t$, we have two variables $y_t$ and $x_t$, which encode the frame appearance component and the evolution of the video over time respectively.
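In the notation commonly used for dynamic textures, the linear dynamical system can be sketched as below. The symbols $A$, $C$, $v_t$ and $w_t$ are not defined in these notes and are assumed here.

$$
x_{t+1} = A x_t + v_t, \qquad y_t = C x_t + w_t
$$

Here $A$ is the state-transition matrix, $C$ maps the state $x_t$ to the observed frame $y_t$, and $v_t$, $w_t$ are Gaussian noise terms.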

  1. Pixel

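    An image is made up of small squares called pixels. Each square has a definite position and an assigned color value, and the pixel is the indivisible unit, or element, of the whole image. E.g. if the image resolution is 72, i.e. 72 pixels per inch, and 1 inch equals 2.54 cm, then one centimeter corresponds to about 28 pixels (72 / 2.54). Likewise, a 15x15 cm image is about 420*420 pixels (15 × 28, or about 425*425 without the intermediate rounding).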

  2. Grey-Scale Value


  3. Gaussian Filtering




  4. Orientation Gaussian Filtering
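The resolution arithmetic in the Pixel item above can be checked with a few lines. The helper names are hypothetical; the only facts used are 1 inch = 2.54 cm and the 72 pixels-per-inch example.

```python
CM_PER_INCH = 2.54

def pixels_per_cm(ppi):
    """Convert a resolution in pixels per inch to pixels per centimeter."""
    return ppi / CM_PER_INCH

def cm_to_pixels(length_cm, ppi):
    """Length in pixels of a physical length at the given resolution."""
    return length_cm * pixels_per_cm(ppi)

per_cm = pixels_per_cm(72)       # about 28.35 pixels per cm
side_px = cm_to_pixels(15, 72)   # about 425.2 pixels for a 15 cm side
```

Rounding `per_cm` to 28 before multiplying gives the 420-pixel figure; keeping full precision gives about 425.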

  1. image modeling

  2. Multivariate Gaussian Distributions



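    For a univariate Gaussian distribution, the variance is $Var(x)=E[(x-E(x))(x-E(x))]$. In the multivariate case, the corresponding quantity for a pair of variables is the covariance, $Cov(x_1,x_2)=E[(x_1-E(x_1))(x_2-E(x_2))]$, and the covariances of all pairs are collected in the covariance matrix.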
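A short numerical check of the formula above, replacing the expectations with sample means. The synthetic data and the helper name `cov` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=1000)
x2 = 0.5 * x1 + rng.normal(size=1000)  # correlated with x1 by construction

def cov(a, b):
    """Sample version of E[(a - E(a)) (b - E(b))]."""
    return np.mean((a - a.mean()) * (b - b.mean()))

# 2x2 covariance matrix: variances on the diagonal, covariance off-diagonal.
sigma = np.array([[cov(x1, x1), cov(x1, x2)],
                  [cov(x2, x1), cov(x2, x2)]])
```

The diagonal entries reduce to the ordinary variances, and the matrix agrees with `np.cov(x1, x2, bias=True)`, which uses the same divide-by-$n$ convention.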



