Sequential behaviour recognition commonly uses a Hidden Markov Model (HMM), but an HMM's state transition model implicitly imposes an exponential (geometric) time-in-state distribution. We replaced this distribution with an empirical time-in-state distribution, giving a Hidden Semi-Markov Model (HSMM). The commonly used HSMM algorithms are O(T^2) in the sequence length T, which makes continuous video computationally infeasible. We located an O(T) algorithm from gene-sequence analysis and adapted it for video-sequence use.
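The difference between the two duration models can be sketched briefly. This is a hypothetical illustration: the self-transition probability and the empirical histogram below are invented values, not CAVIAR estimates.

```python
import numpy as np

# An HMM that stays in a state with self-transition probability a has an
# implicit geometric (discrete exponential) duration distribution:
#   P(d) = a**(d-1) * (1 - a)
# An HSMM replaces this with an arbitrary empirical time-in-state histogram.

a = 0.9                                 # HMM self-transition probability
d = np.arange(1, 11)                    # durations 1..10 frames
hmm_duration = a ** (d - 1) * (1 - a)   # implicit geometric distribution

# An HSMM can instead use any normalised histogram estimated from
# training video, e.g. a peaked time-in-state distribution:
empirical = np.array([0.00, 0.05, 0.10, 0.20, 0.30,
                      0.20, 0.10, 0.05, 0.00, 0.00])
```

The geometric distribution is forced to be monotonically decreasing, whereas real time-in-state distributions for behaviours are usually peaked around a typical duration, which is what the empirical histogram captures.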
We represented behaviour in a four-level scheme: movement and roles are inferred from image evidence, and instantaneous 'situation' and long-term 'context' descriptions are represented as graphs. We developed both a rule-based symbolic 'parsing' of the video sequence and an HSMM recognizer. The former is simpler, but the latter copes better with soft (probabilistic) evidence. We then compared the two algorithms, recognizing behaviour using the ground-truth tracking, IST feature descriptions and UEDIN role hypothesizing, over 7 context models, 80 sequences and 417 tracked persons. The rule-based recognizer achieved 57% and the HSMM recognizer 65% correct recognition of the contexts.
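How per-frame situation labels are formed from role and short-term activity labels can be sketched as follows. The level structure matches the description above, but the label strings and the combination function are illustrative assumptions, not the actual CAVIAR vocabulary.

```python
# Hypothetical sketch: an instantaneous 'situation' is formed by
# combining a role label with a short-term activity label; a 'context'
# model is then matched against the resulting per-frame sequence.

def situation(role, activity):
    """Combine a role and a short-term activity into one situation label."""
    return f"{role}:{activity}"

roles = ["browser", "browser", "walker"]        # one role per frame
activities = ["inactive", "active", "walking"]  # one activity per frame

situations = [situation(r, a) for r, a in zip(roles, activities)]
```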
We investigated whether an algorithm based on hard categorical decisions and hand-crafted decision rules would give better or worse recognition results than the probabilistic HSMM recognizer. We compared the rule-based 'parser' (which tolerates some single-frame movement and role classification errors) with the HSMM algorithm, which can exploit marginal evidence at a lower probability. In the following, the data come from all ground-truth sequences, using the ground-truth short-term activity and role classifications. True class labels are at the left edge of each table. The context label abbreviations are those used in the column headings of the confusion matrices below.
This classifier used hand-tuned rule-based and procedural matching algorithms (like a parser that tolerates erroneous states) to match the different context model graphs to the sequence of situations for each video. The situation sequence was derived from the combination of role and short-term activity labels. Overall, 70% of the situations (individual frames) were correctly classified and 57% of the behaviours (context models) were correctly recognized.
True\Pred | CW | CB | CI | CEn | CEx | CR | CWi | CErr | Tot | % |
CW | 63470 | 1103 | 3366 | 1149 | 58 | . | . | 5525 | 74671 | 85 |
CB | 656 | 15934 | . | 1188 | . | . | . | 8780 | 26558 | 60 |
CI | 5512 | 2575 | 18768 | . | . | 232 | . | 2704 | 29791 | 63 |
CEn | 1048 | . | . | 16785 | . | 6011 | . | 2384 | 26228 | 64 |
CEx | 371 | . | . | . | 10603 | . | . | 26895 | 37869 | 28 |
CR | . | . | . | . | . | . | . | 2488 | 2488 | 0 |
CWi | 67 | 10139 | . | . | 3766 | . | . | 8601 | 22573 | 0 |
Total | . | . | . | . | . | . | . | . | 220178 | 57 |
This classifier used the HSMM matching algorithm to match the different context model graphs to the sequence of situations for each video. The situation sequence was derived from the combination of role and short-term activity labels. Overall, 74% of the situations (individual frames) were correctly classified and 65% of the behaviours (context models) were correctly recognized, slightly better overall than the rule-based approach.
True\Pred | CW | CB | CI | CEn | CEx | CR | CWi | CErr | Tot | % |
CW | 65710 | 1103 | 4099 | 368 | . | . | . | 3391 | 74671 | 88 |
CB | 656 | 14872 | . | . | . | . | . | 11030 | 26558 | 56 |
CI | 191 | 224 | 21747 | . | . | . | . | 7629 | 29791 | 73 |
CEn | 1049 | . | . | 16261 | . | . | 17 | 8918 | 26228 | 62 |
CEx | 371 | . | . | . | 21206 | . | . | 16292 | 37869 | 56 |
CR | . | . | . | . | . | 528 | . | 1891 | 2488 | 20 |
CWi | . | 9565 | . | . | . | . | 2934 | 10074 | 22573 | 13 |
Total | . | . | . | . | . | . | . | . | 220178 | 65 |
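The per-class percentages in both tables are simply the diagonal count divided by the row total. A minimal check, using the published counts from the CW row of the HSMM table above ('.' entries are zero):

```python
# Per-class recognition rate from one confusion-matrix row:
# diagonal count / row total, expressed as a percentage.
# Counts are the CW row of the HSMM confusion matrix.

row = {"CW": 65710, "CB": 1103, "CI": 4099, "CEn": 368, "CErr": 3391}
total = sum(row.values())              # row total (the 'Tot' column)
rate = round(100 * row["CW"] / total)  # correct-recognition percentage
```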
A paper that describes the algorithm is: