An Unsupervised Topic Segmentation Model Incorporating Word Order
Shoaib Jameel and Wai Lam
The Chinese University of Hong Kong
One-line summary of the work: we show how maintaining document structure, such as paragraphs, sentences, and word order, improves the performance of a topic model.
Outline
Motivation
Related Work
- Probabilistic Unigram Topic Model (LDA)
- Probabilistic N-gram Topic Models
- Topic Segmentation Models
Overview of Our Model
- Our N-gram Topic Segmentation model (NTSeg)
Text Mining Experiments of NTSeg
- Word-Topic and Segment-Topic Correlation Graph
- Topic Segmentation Experiment
- Document Classification Experiment
- Document Likelihood Experiment
Conclusions and Future Directions
Motivation

Many works in the topic modeling literature assume exchangeability among the words. As a result, many ambiguous words appear in topics. For example, consider a few topics obtained from the NIPS collection using the Latent Dirichlet Allocation (LDA) model:

Example
Topic 1        Topic 2    Topic 3        Topic 4    Topic 5
architecture   order      connectionist  potential  prior
recurrent      first      role           membrane   bayesian
network        second     binding        current    data
module         analysis   structures     synaptic   evidence
modules        small      distributed    dendritic  experts
The problem with the LDA model: words in topics are not insightful.

Latent Dirichlet Allocation Model (LDA) (Blei et al., JMLR-2003)
Generative process (shown alongside the graphical model in plate notation, with hyperparameters α and β, document-topic distributions θ, topic-word distributions φ, and plates N, Z, and M):
1. Draw θ^(d) from Dirichlet(α), where θ^(d) is the topic distribution for document d
2. Draw φ_z from Dirichlet(β), where φ_z is the word distribution for topic z
3. For every word i in document d:
   3.1 Draw a topic z_i^(d) from Multinomial(θ^(d))
   3.2 Draw a word w_i^(d) from Multinomial(φ_{z_i^(d)})
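As a concrete companion to these steps, here is a minimal runnable sketch of the LDA generative story in Python; the corpus sizes and hyperparameter values are illustrative assumptions, not settings from the paper.

```python
import numpy as np

# Minimal sketch of the LDA generative process; sizes and
# hyperparameters below are illustrative, not from the paper.
rng = np.random.default_rng(0)
Z, V, N = 5, 1000, 100          # topics, vocabulary size, words in one document
alpha, beta = 0.1, 0.01         # symmetric Dirichlet hyperparameters

phi = rng.dirichlet(np.full(V, beta), size=Z)   # step 2: word distribution per topic
theta = rng.dirichlet(np.full(Z, alpha))        # step 1: topic distribution for document d

doc = []
for _ in range(N):                              # step 3: for every word position
    z = rng.choice(Z, p=theta)                  #   3.1 draw a topic from Multinomial(theta)
    w = rng.choice(V, p=phi[z])                 #   3.2 draw a word from Multinomial(phi_z)
    doc.append(w)
```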
Relaxing the Bag-of-Words Assumption in a Topic Model

Can the bag-of-words assumption be relaxed in a topic model? Doing so makes sense, as ordered text is how documents are written by humans, and also how they are read.
[Figure: the same sentence shown twice. On the left, its words ("more", "dog", "a", "noticed", "cat", "I", ...) are scattered in random order; on the right they read "Here things are a lot more organized. I can understand that it was actually the cat which noticed a dog."]

Figure: This is how the bag-of-words view looks (left): complete chaos. The one on the right makes more sense to us.
Relaxing the Bag-of-Words Assumption: Bigram Topic Model (BTM) (Wallach, ICML-2006)
Some properties of the model:
- A word is generated by both the topic and the previous word
- Inspired by the Hierarchical Dirichlet Language Model
- Better empirical results than the LDA model
A limitation of the model:
- Always generates bigrams in a topic
[Graphical model of BTM in plate notation: α, θ; topic chain z_{i-1}^(d), z_i^(d), z_{i+1}^(d), z_{i+2}^(d); word chain w_{i-1}^(d), w_i^(d), w_{i+1}^(d), w_{i+2}^(d); σ, δ; plates M and ZV]
Relaxing the Bag-of-Words Assumption: LDA-Collocation Model (LDACOL) (Griffiths et al., Psy. Rev-2007)
Some properties of the model:
- A word is generated by the topic, the previous word, and a binary bigram status variable
- Each word has a topic assignment and a collocation assignment
- Can form longer-order phrases
- Can generate both unigram and bigram words
A limitation of the model:
- Only the first word in a bigram has a topic assignment
[Graphical model of LDACOL in plate notation: α, θ, γ, ψ; topic chain z^(d); status variables x^(d); word chain w^(d); β, φ, σ, δ; plates M, Z, and V]
Relaxing the Bag-of-Words Assumption: Topical N-Gram Model (TNG) (Wang et al., ICDM-2007)
Some properties of the model:
- Extends the LDACOL model
- Each word has a topic assignment and a collocation assignment
- Can form longer-order phrases
- Can generate both unigram and bigram words
A limitation of the model:
- Words in a bigram may have different topic assignments
[Graphical model of TNG in plate notation: α, θ, γ, ψ; topic chain z^(d); status variables x^(d); word chain w^(d); β, φ, σ, δ; plates M, Z, and ZV]
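To clarify the shared machinery of LDACOL and TNG, the following toy sketch (our own illustration; all sizes, names, and priors are assumptions) generates a word stream where a binary status variable x decides between drawing a unigram from the topic's word distribution and extending a collocation conditioned on the previous word.

```python
import numpy as np

# Sketch of unigram/bigram generation via the status variable x
# (toy sizes; illustrative, not the authors' code).
rng = np.random.default_rng(1)
Z, V = 5, 50
theta = rng.dirichlet(np.full(Z, 0.1))                # document's topic distribution
phi   = rng.dirichlet(np.full(V, 0.01), size=Z)       # topic -> word (unigram case)
sigma = rng.dirichlet(np.full(V, 0.01), size=(Z, V))  # (topic, prev word) -> word (bigram case)
psi   = rng.dirichlet(np.ones(2), size=(Z, V))        # (topic, prev word) -> P(x)

words, status = [int(rng.choice(V))], [0]             # the first word is always a unigram
for _ in range(10):
    z = rng.choice(Z, p=theta)
    x = rng.choice(2, p=psi[z, words[-1]])            # x = 1: continue a collocation
    w = rng.choice(V, p=sigma[z, words[-1]] if x else phi[z])
    words.append(int(w)); status.append(int(x))
```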
What Topic N-gram Models Do - An Illustration
[Figure: a sample paper treated as one block of text.
Para. 1 (Abstract): "We give necessary and sufficient conditions for uniqueness of the support vector solution for the problems of pattern recognition and regression estimation, for a general class of cost functions. We show that if the solution is not unique, all support vectors are necessarily at bound, and we give some simple examples of non-unique solutions. We note that uniqueness of the primal (dual) solution does not necessarily imply uniqueness of the dual (primal) solution. We show how to compute the threshold b when the solution is unique, but when all support vectors are bound, in which case the usual method for determining b does not work..."
Para. 2 (Acknowledgements and Reference): "C. Burges wishes to thank W. Keasler, V. Lawrence and C. Nohl of Lucent Technologies for their support. Reference: [1] R. Fletcher, Practical Methods of Optimization. John Wiley and Sons, Inc., 2nd edition, 1987."]

Topic 1: support vector, cost functions
Topic 2: acknowledgements, reference

These models consider the document as a whole and find topical n-grams in the document.
Bag of Words in Topic Segmentation

- These models maintain document structure such as paragraphs or sentences
- They assume that words within a paragraph or a sentence are exchangeable
- They introduce the notion of segment-topics (or super-topics) and word-topics
[Figure: "Paragraph n in the document d" and "Paragraph n + 1 in the document d", each drawn with its words ("this", "is", "paragraph", "next", "will", "follow", ...) scrambled into a random arrangement, illustrating that word order within each segment is ignored.]
A Topic Segmentation Model (LDSEG) (Shafiei et al., Canadian AI-2008)
Model properties:
- Performs topic segmentation
- Can work at the paragraph and sentence level
- A binary variable c captures the change in topics segment-wise
- Segments come from a predefined number of super-topics
- The super-topics comprise a mixture of word-topics
[Graphical model in plate notation: τ, ρ, π, Ω, y, c, θ, α, β, z, w, φ; plates N, Z, S, and M]
A Topic Segmentation Model (LDSEG) (Shafiei et al., Canadian AI-2008), continued
Model properties (continued):
- The word-generation region of the model is similar to the LDA model
- Segments exhibit multiple topics
- Words are generated from a predefined number of word-topics
[Graphical model in plate notation, as above, with the LDA-like region (θ, z, w, φ, β) highlighted]
Topic Segmentation Illustration using the LDSEG Model
[Figure: the same sample paper, now segmented by LDSEG. Para. 1 (the Abstract) is assigned Super-Topic 7; Para. 2 (the Acknowledgements and Reference) is assigned Super-Topic 4.
Word-Topic 1: support, vector, cost. Word-Topic 2: acknowledgements, reference.
Super-Topic 1 mixes Word-Topics 1, 9, and 12; Super-Topic 2 mixes Word-Topics 5, 8, and 1.]

- Performs topic segmentation
- Unigram words are assigned to the word-topics
- Segments are assigned to the document-topics, or super-topics
Our Research Contributions
- We propose a model called NTSeg
- Our proposed model maintains document structure such as paragraphs and sentences
- It maintains the order of the words in the document
- It detects and coordinates two topic granularity levels:
  - Segment-Topics
  - Word-Topics
- Derivation of the posterior inference scheme
- Conducted extensive text mining experiments
  - Shown improvement over state-of-the-art models
Our Proposed Model (NTSeg)
[Graphical model of NTSeg in plate notation: hyperparameters Ω, ρ, α, β, γ, σ, δ; per-document variables π^(d) and τ^(d); per-segment variables y_{s-1}^(d), y_s^(d), c_{s-1}^(d), c_s^(d), θ_{s-1}^(d), θ_s^(d); per-word chains z_{s,n}^(d), x_{s,n}^(d), w_{s,n}^(d); parameters φ and ψ; plates K, M, Z, and ZV]
A Few Properties of NTSeg
- Segments are assigned to the segment-topics
- We assume a Markov property on the segment-topics y_s^(d)
- c_s^(d) denotes the segment-topic change-points
- Segments can be taken as paragraphs or sentences
More Properties of NTSeg
- Does not break the order of the words
- Can form unigrams, bigrams, and higher-order phrases (using the x variable)
- The words of a phrase share the same topic (see the sketch after the figure below)
[Figure: a fragment of the NTSeg word layer, showing θ, the topic chain z_{i-1}^(d), z_i^(d), z_{i+1}^(d), z_{i+2}^(d), the status variables x_i^(d), x_{i+1}^(d), x_{i+2}^(d), and the word chain w_{i-1}^(d), w_i^(d), w_{i+1}^(d), w_{i+2}^(d). The words "Irish", "cricket", and "team" are each assigned word-topic 7, and the status variables between them are set to x = 1, so they join into the single phrase "Irish cricket team".]

Figure: This is how we form longer phrases.
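Given per-word topic and x assignments, phrases fall out deterministically; the toy sketch below (our own illustration, not the authors' code) assembles them, with every word of a phrase sharing one word-topic as NTSeg requires.

```python
# Toy sketch: recover phrases from per-word topic and x assignments.
words  = ["Irish", "cricket", "team", "won"]
topics = [7, 7, 7, 3]        # word-topic of each word
x      = [0, 1, 1, 0]        # x[i] = 1 glues word i to word i-1

phrases, current = [], [words[0]]
for i in range(1, len(words)):
    if x[i] == 1:                       # same phrase; NTSeg ties it to one topic
        current.append(words[i])
    else:                               # phrase boundary: emit and start afresh
        phrases.append((" ".join(current), topics[i - 1]))
        current = [words[i]]
phrases.append((" ".join(current), topics[-1]))
print(phrases)   # [('Irish cricket team', 7), ('won', 3)]
```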
Technical Challenges for NTSeg
- Sharing of the same word-topic among words in a phrase
- Coordination of segment-topics and word-topics
- Derivation of the posterior inference scheme
  - Gibbs sampling update equations
Posterior Inference: Gibbs Sampling
Sampling word-topic assignments
$$
P(z_{si}^{(d)}, x_{si}^{(d)} \mid \mathbf{w}, \mathbf{z}_{\neg si}^{(d)}, \mathbf{x}_{\neg si}^{(d)}, \mathbf{y}, \mathbf{c}, \alpha, \beta, \gamma, \delta, \rho, \Omega) \propto
\bigl(\alpha_{y_s^{(d)} z_{si}^{(d)}} + h^{(d)}_{s z_{si}^{(d)}} - 1\bigr)
\bigl(\gamma_{x_{si}^{(d)} z_{s,i-1}^{(d)} w_{s,i-1}^{(d)}} + p^{(d)}_{z_{s,i-1}^{(d)} w_{s,i-1}^{(d)} x_{si}^{(d)}} - 1\bigr)
\times
\begin{cases}
\dfrac{\beta_{w_{si}^{(d)}} + n_{z_{si}^{(d)} w_{si}^{(d)}} - 1}{\sum_{v=1}^{V} \bigl(\beta_{v} + n_{z_{si}^{(d)} v}\bigr) - 1} & \text{if } x_{si}^{(d)} = 0, \\[2ex]
\dfrac{\delta_{w_{si}^{(d)}} + m_{w_{s,i-1}^{(d)} w_{si}^{(d)} z_{si}^{(d)}} - 1}{\sum_{v=1}^{V} \bigl(\delta_{v} + m_{w_{s,i-1}^{(d)} v z_{si}^{(d)}}\bigr) - 1} & \text{if } x_{si}^{(d)} = 1 \text{ and } z_{si}^{(d)} = z_{s,i-1}^{(d)}.
\end{cases}
\tag{1}
$$
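For illustration only, a collapsed-Gibbs draw of (z, x) for one token under Eq. (1) can be organized as below; the count-array names, shapes, and symmetric scalar hyperparameters are our own assumptions rather than the paper's implementation.

```python
import numpy as np

# Hedged sketch of Eq. (1): jointly sample (z, x) for one token from
# unnormalized weights. Count arrays exclude the current token; all
# names, shapes, and the symmetric hyperparameters are illustrative.
def sample_zx(w, w_prev, z_prev, h_s, p_prev, n_zw, m_pwz,
              alpha, beta, gamma, delta, rng):
    Z, V = n_zw.shape
    weights = np.zeros((Z, 2))
    for z in range(Z):
        topic_term = alpha + h_s[z]                   # segment-topic count term
        for x in (0, 1):
            status_term = gamma + p_prev[x]           # bigram-status count term
            if x == 0:                                # unigram: word from phi_z
                word_term = (beta + n_zw[z, w]) / (V * beta + n_zw[z].sum())
            elif z == z_prev:                         # phrase continues: topic is shared
                word_term = ((delta + m_pwz[w_prev, w, z])
                             / (V * delta + m_pwz[w_prev, :, z].sum()))
            else:
                continue                              # no topic switch mid-phrase in NTSeg
            weights[z, x] = topic_term * status_term * word_term
    flat = weights.ravel() / weights.sum()
    k = rng.choice(2 * Z, p=flat)
    return k // 2, k % 2                              # sampled (z, x)
```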
Posterior Inference: Gibbs Sampling
Sampling segment-topic assignments
$$
P(y_s^{(d)}, c_s^{(d)} \mid \mathbf{z}, \mathbf{y}_{\neg s}^{(d)}, \mathbf{c}_{\neg s}^{(d)}, \mathbf{w}, \mathbf{x}, \alpha, \beta, \gamma, \delta, \rho, \Omega) \propto
\bigl(\rho_{y_s^{(d)}} + b^{(d)}_{y_s^{(d)}} - 1\bigr)
\bigl(\alpha_{y_s^{(d)} z_{si}^{(d)}} + h^{(d)}_{s z_{si}^{(d)}} - 1\bigr)
\times
\begin{cases}
\dfrac{\kappa^{(d)}_{c_s,0} + \Omega_0}{\sum_{x=0}^{1} \kappa^{(d)}_{c_s,x} + \Omega_0 + \Omega_1} & \text{if } c_s^{(d)} = 0, \\[2ex]
\dfrac{\kappa^{(d)}_{c_s,1} + \Omega_1}{\sum_{x=0}^{1} \kappa^{(d)}_{c_s,x} + \Omega_0 + \Omega_1}\,
\bigl(\alpha_{y_s^{(d)} z_{si}^{(d)}} + h^{(d)}_{s z_{si}^{(d)}} - 1\bigr) & \text{if } c_s^{(d)} = 1,\ s > 1,\ \text{and } y_s^{(d)} = y_{s-1}^{(d)}.
\end{cases}
\tag{2}
$$
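In the same hedged spirit, a draw of (y_s, c_s) under Eq. (2) weighs each candidate segment-topic by its count terms, with the c_s = 1 branch available only when the segment keeps the previous segment-topic; names and shapes are again our own assumptions.

```python
import numpy as np

# Hedged sketch of Eq. (2): jointly sample (y_s, c_s) for one segment.
# b_y, h_sz, kappa are count arrays excluding segment s; rho, alpha,
# Omega are hyperparameters; everything here is illustrative.
def sample_yc(b_y, h_sz, kappa, rho, alpha, Omega, y_prev, rng):
    K = b_y.shape[0]
    denom = kappa.sum() + Omega.sum()
    weights = np.zeros((K, 2))
    for y in range(K):
        base = (rho + b_y[y]) * np.prod(alpha + h_sz[y])   # segment- and word-topic terms
        weights[y, 0] = base * (kappa[0] + Omega[0]) / denom
        if y == y_prev:                      # c_s = 1: segment-topic carried over
            weights[y, 1] = base * (kappa[1] + Omega[1]) / denom
    flat = weights.ravel() / weights.sum()
    k = rng.choice(2 * K, p=flat)
    return k // 2, k % 2                     # sampled (y_s, c_s)
```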
NTSeg Word-Topic and Segment-Topic Illustration
[Figure: the same sample paper, segmented by NTSeg. Para. 1 (the Abstract) is assigned Segment-Topic 7; Para. 2 (the Acknowledgements and Reference) is assigned Segment-Topic 4.
Word-Topic 1: support vector, cost functions. Word-Topic 2: acknowledgements, reference.
Segment-Topic 1 mixes Word-Topics 1, 9, and 12; Segment-Topic 2 mixes Word-Topics 5, 8, and 1.]

- Performs document segmentation based on topic
- N-gram words are assigned to the word-topics
- Segments are assigned to the segment-topics
Word-Topic and Segment-Topic Correlation Graph
- Used a large dataset: OHSUMED
  - OHSUMED consists of 348,566 medical abstracts
- The idea is to show the discovery of n-gram topic words via the correlation graph
Word-Topic and Segment-Topic Correlation Graph: Result of NTSeg
[Figure: a fragment of the correlation graph learned by NTSeg from OHSUMED. Topic nodes are labeled with n-grams such as "human immunodeficiency virus", "umbilical artery", "fetal heart", "antibody response", "family physicians", "primary care", "sexual abuse", "family practice", "young children", "gestational age", "confidence interval", "pregnant woman", "preterm infants", and "risk factors", alongside unigrams such as hiv, aids, infection, vaccine, health, children, and women.]
Topic Correlation Graph: Correlation Graph from GD-LDA (Caballero et al., CIKM-2012)

[Figure: the correlation graph produced by GD-LDA, whose topic nodes are labeled only with unigrams such as computer, internet, technology, information, system, patient, disease, medical, drug, people, economy, bank, investor, price, percent, clinton, president, military, war, nuclear, nato, politic, and chechnya.]
Topic Segmentation Experiment

- The ordering of c_s^(d) gives the topic change-points in the document
- Used two benchmark datasets: Books and Lectures (details below)
  - Books dataset: a medical textbook with 227 chapters of about 140 sentences each
  - Lectures dataset: undergraduate lecture recordings from Physics and AI classes; each 90-minute lecture contains about 700 sentences and 8,500 words
- A segment here is a sentence
- Comparative method: the TopicTiling algorithm (Riedl and Biemann, ACL-2012)
- Used two commonly used evaluation metrics:
  - Pk: the probability that two sentences drawn randomly from a document are incorrectly identified as belonging to the same topic segment
  - WinDiff: moves a sliding window across the text and counts the number of times the hypothesized and reference segment boundaries differ within the window
- Both metrics are error estimates, so the lower, the better (a sketch of both follows)
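Both metrics are easy to compute from boundary indicator sequences. The sketch below is our own implementation, with the window size k conventionally set to half the mean reference segment length when not supplied.

```python
# Sketch of the Pk and WindowDiff error metrics over boundary indicator
# sequences (1 = segment boundary right after this sentence); our own code.
def pk(reference, hypothesis, k=None):
    n = len(reference)
    if k is None:  # conventional choice: half the mean reference segment length
        k = max(1, round(n / (sum(reference) + 1) / 2))
    errors = 0
    for i in range(n - k):
        same_ref = sum(reference[i:i + k]) == 0   # sentences i and i+k co-segmented?
        same_hyp = sum(hypothesis[i:i + k]) == 0
        errors += same_ref != same_hyp
    return errors / (n - k)

def windowdiff(reference, hypothesis, k=None):
    n = len(reference)
    if k is None:
        k = max(1, round(n / (sum(reference) + 1) / 2))
    errors = 0
    for i in range(n - k):
        # count boundaries inside the window under each segmentation
        errors += sum(reference[i:i + k]) != sum(hypothesis[i:i + k])
    return errors / (n - k)

ref = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
hyp = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
print(pk(ref, hyp), windowdiff(ref, hyp))
```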
Topic Segmentation Results: Books dataset

[Figure: six panels of Pk (top row) and WinDiff (bottom row) against the number of word-topics Z (20 to 100), for K = 10, 30, and 50 segment-topics. Pk varies roughly between 0.310 and 0.330, and WinDiff roughly between 0.330 and 0.350.]
Topic Segmentation Results: Lectures dataset

[Figure: six panels of Pk (top row) and WinDiff (bottom row) against the number of word-topics Z (20 to 100), for K = 10, 30, and 50 segment-topics. Pk varies roughly between 0.350 and 0.380, and WinDiff roughly between 0.430 and 0.461.]
Document Classification Experiment: Dataset
- Generated four datasets from the 20 Newsgroups data
- The datasets are:
  - Computer
  - Politics
  - Sports
  - Science
- Each dataset comprises an equal number of documents from several classes. For example, the Computer dataset consists of the following classes:
  - Graphics
  - Hardware
  - X Windows
  - Mac
  - Microsoft Windows
Document Classification Experiment: Experimental Setup
- Split each dataset into a training and a test set, maintaining the class distribution
  - We used 75% for training and 25% for testing in our experiments
- For each class, we generate a topic model using the training set
- During classification, we compute the likelihood of each test document under each class's topic model
- The test document is assigned to the class whose model gives the maximum likelihood (see the sketch below)
- Evaluation metrics:
  - Standard precision, recall, and F-measure for each class
  - Adopted a macro-averaging scheme
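The classification rule itself is simple; in the hedged sketch below, `train_topic_model` and `log_likelihood` are hypothetical stand-ins for NTSeg training and held-out document scoring.

```python
from collections import defaultdict

# Sketch of likelihood-based classification with per-class topic models.
# `train_topic_model` and `log_likelihood` are hypothetical stand-ins.
def classify(test_docs, train_docs_by_class):
    models = {c: train_topic_model(docs) for c, docs in train_docs_by_class.items()}
    return [max(models, key=lambda c: log_likelihood(models[c], d)) for d in test_docs]

# Macro-averaged precision/recall/F-measure over the predicted labels.
def macro_prf(gold, pred):
    classes = set(gold)
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1; fn[g] += 1
    P = sum(tp[c] / ((tp[c] + fp[c]) or 1) for c in classes) / len(classes)
    R = sum(tp[c] / ((tp[c] + fn[c]) or 1) for c in classes) / len(classes)
    F = (2 * P * R / (P + R)) if (P + R) else 0.0
    return P, R, F
```

Computing F from the macro-averaged precision and recall is consistent with the results table below (e.g., 2 x 0.580 x 0.420 / (0.580 + 0.420) ≈ 0.487 for LDSEG on the Computer dataset).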
Document Classification Experiment: Comparative Methods
- Latent Dirichlet Segmentation model (word-topics and super-topics): LDSEG (Shafiei et al., Canadian AI-2008)
- Pachinko Allocation Model (super-topics and word-topics): PAM (Li and McCallum, ICML-2006)
- LDA Collocation Model (n-gram topic model): LDACOL (Griffiths et al., Psy. Rev-2007)
- Topical N-gram Model (n-gram topic model): TNG (Wang et al., ICDM-2007)
- Phrase-discovering topic model based on the hierarchical Pitman-Yor process: PDLDA (Lindsey et al., EMNLP-2012)
Document Classification Experiment: Results
Computer dataset
         Precision  Recall  F-Measure
LDSEG    0.580      0.420   0.487
PAM      0.550      0.450   0.495
LDACOL   0.400      0.300   0.343
TNG      0.490      0.420   0.452
PDLDA    0.580      0.500   0.537
NTSeg    0.640      0.520   0.574

Science dataset
         Precision  Recall  F-Measure
LDSEG    0.440      0.400   0.419
PAM      0.500      0.330   0.398
LDACOL   0.420      0.370   0.393
TNG      0.560      0.470   0.511
PDLDA    0.580      0.510   0.543
NTSeg    0.620      0.560   0.588

Politics dataset
         Precision  Recall  F-Measure
LDSEG    0.390      0.320   0.352
PAM      0.540      0.490   0.514
LDACOL   0.550      0.410   0.470
TNG      0.550      0.450   0.495
PDLDA    0.590      0.410   0.484
NTSeg    0.620      0.570   0.594

Sports dataset
         Precision  Recall  F-Measure
LDSEG    0.330      0.320   0.325
PAM      0.368      0.360   0.363
LDACOL   0.200      0.180   0.189
TNG      0.340      0.290   0.313
PDLDA    0.380      0.210   0.271
NTSeg    0.420      0.380   0.399
Document Modeling Experiment
NIPS dataset

[Figure: held-out log-likelihood (scale 10^6, roughly -8.8 to -8.6) against the number of word-topics Z (50 to 200), for K = 10 and K = 50 segment-topics, comparing NTSeg, LDSEG, PAM, LDACOL, TNG, and PDLDA.]
Document Modeling Experiment: Results
OHSUMED dataset (348,566 medical abstracts)

[Figure: held-out log-likelihood (scale 10^7, roughly -3.30 to -3.15) against the number of word-topics Z (200 to 500), for K = 50 and K = 150 segment-topics, comparing NTSeg, LDSEG, PAM, LDACOL, TNG, and PDLDA.]
Concluding Remarks
We have presented a topic segmentation model that:
- Maintains document structure such as paragraphs and sentences
- Keeps the order of the words intact
We have applied our model to a multitude of text mining tasks
- We have obtained good improvements over state-of-the-art models
Future direction: a nonparametric topic segmentation model
- We wish to automatically infer the number of latent topics that describes the collection, instead of manually supplying and varying the number of latent topics
Acknowledgments
Special thanks to SIGIR (the organization) for the Student Travel Award, and to the local organizers for waiving my student registration fee (and for helping me keep busy for about 8 hours during the conference). The experience has been rewarding.
References
Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. JMLR, 3, 993-1022.
Wallach, H. M. (2006). Topic modeling: beyond bag-of-words. In Proc. of ICML (pp. 977-984).
Griffiths, T. L., Steyvers, M., and Tenenbaum, J. B. (2007). Topics in semantic representation. Psychological Review, 114(2), 211.
Wang, X., McCallum, A., and Wei, X. (2007). Topical n-grams: phrase and topic discovery, with an application to information retrieval. In Proc. of ICDM (pp. 697-702).
Caballero, K. L., Barajas, J., and Akella, R. (2012). The generalized Dirichlet distribution in enhanced topic detection. In Proc. of CIKM (pp. 773-782).
Riedl, M., and Biemann, C. (2012). TopicTiling: a text segmentation algorithm based on LDA. In Proc. of ACL (pp. 37-42).
Lindsey, R. V., Headden, W. P., and Stipicevic, M. J. (2012). A phrase-discovering topic model using hierarchical Pitman-Yor processes. In Proc. of EMNLP (pp. 214-222).
Li, W., and McCallum, A. (2006). Pachinko allocation: DAG-structured mixture models of topic correlations. In Proc. of ICML (pp. 577-584).