How can HMMs be used with deep learning for text processing?
Answers
Several estimation problems can be solved within the HMM framework, including likelihood evaluation, parameter estimation, and state estimation. Each uses a different algorithm: the forward algorithm for likelihood evaluation, Baum-Welch (built on forward-backward) for parameter estimation, and Viterbi for the most likely state sequence.

My understanding of recurrent neural networks is that you replace the memoryless input to the activation function at each node (a weighted sum) with a difference equation, thereby introducing memory into the processing. The output of such a system can be interpreted as a probability (if using a sigmoid function), so the network can be used for probabilistic classification tasks.

If your problem is to tag a time series with the most likely states from a finite set, then HMMs can certainly be used. I'm not sure how well RNNs work for such problems, but I imagine there are open questions about training and convergence, and therefore about how much data is required to reach a given level of performance. For HMMs, there is a lot of literature, and the model is well understood because it is based on discrete-time Markov chains. You can choose your states and the structure of the transition matrix to represent the process you are modelling (e.g. speech signals).

The only catch with HMMs is that the complexity of the algorithms (forward-backward and Viterbi) grows with the square of the number of discrete states: roughly O(N^2 T) for N states and a sequence of length T when the transition matrix is dense. So while you may be able to write down the HMM for a given problem, you may find it hard to implement if you have too many states.
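
To make the tagging use case and the quadratic cost concrete, here is a minimal sketch of Viterbi decoding in plain Python. It is only an illustration, not a production implementation: the function name `viterbi`, the argument layout, and the two-state toy model with its probabilities are all assumptions made up for this example. The two nested loops over states inside the loop over time steps are where the cost proportional to the square of the number of states comes from.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state path for `obs`, plus its log-probability.

    obs     : list of observation indices (length T)
    start_p : start_p[i]    = P(first state = i)
    trans_p : trans_p[i][j] = P(next state = j | current state = i)
    emit_p  : emit_p[i][k]  = P(observation = k | state = i)
    """
    n_states = len(start_p)

    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # delta[j] = best log-probability of any path that ends in state j
    delta = [log(start_p[j]) + log(emit_p[j][obs[0]]) for j in range(n_states)]
    backptr = []

    for o in obs[1:]:                    # T - 1 time steps
        new_delta, ptr = [], []
        for j in range(n_states):        # N destination states
            # Scanning N source states here gives N * N work per time step.
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log(trans_p[i][j]))
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log(trans_p[best_i][j])
                             + log(emit_p[j][o]))
        delta = new_delta
        backptr.append(ptr)

    # Backtrack from the best final state.
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]


# Toy model (made-up numbers): 2 hidden states, 3 observation symbols.
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3],
           [0.4, 0.6]]
emit_p = [[0.5, 0.4, 0.1],
          [0.1, 0.3, 0.6]]

path, logp = viterbi([0, 1, 2], start_p, trans_p, emit_p)
prob = math.exp(logp)
print(path, prob)  # path is [0, 0, 1]; prob is about 0.01512
```

For text tagging, the hidden states would be the tags and the observations would be word indices. With a few dozen tags this is cheap, but the inner double loop is what becomes the bottleneck once the state set grows large.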