Main

type

5 (blog/news article)

status

21 (imported old-v2, waiting for another import)

review version

0

cleanup version

0

pending deletion

0 (-)

created at

2025-11-08 18:05:28

updated at

2025-11-08 18:05:28

Address

url

http://blog.so8848.com/2010/07/collapsed-gibbs-sampling-for-lda-and.html

url length

72

url crc

24113

url crc32

3406847537

location type

1 (url matches target location, page_location is empty)

canonical status

2 (missing canonical tag in html)

canonical page id

-

Source

domain id

498474261

domain tld

2211

domain parts

0

originating warc id

-

originating url

https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-33/segments/1754151280076.69/warc/CC-MAIN-20250809045158-20250809075158-00663.warc.gz

source type

11 (CommonCrawl)

Server response

server ip

142.251.167.121

Publication date

2025-08-09 06:01:49

Fetch attempts

0

Original html size

65286

Normalized and saved size

26748

Content

title

Collapsed Gibbs Sampling for LDA and Bayesian Naive Bayes | Information Retrieval Blog

excerpt

content

Collapsed Gibbs Sampling for LDA and Bayesian Naive Bayes, via LingPipe Blog, by lingpipe on 7/13/10. I've uploaded a short (though dense) tech report that works through the collapsing of Gibbs samplers for latent Dirichlet allocation (LDA) and the Bayesian formulation of naive Bayes (NB): Carpenter, Bob. 2010. Integrating out multinomial parameters in latent Dirichlet allocation and naive Bayes for collapsed Gibbs sampling. LingPipe Technical Report. Thomas L. Griffiths and Mark Steyvers used the collapsed sampler for LDA in their (old enough to be open access) PNAS paper, "Finding scientific topics". They show the final form, but don't derive the integral or provide a citation. I suppose these 25-step integrals are supposed to be child's play. Maybe they are if you're a physicist or theoretical statistician. But that was a whole lot of choppin' with the algebra and the calculus for a simple country computational linguist like me. ...

author

updated

1764198227

Text analysis

block type

0

extracted fields

104

extracted bits

title
full content
content was extracted heuristically

detected location

0

detected language

1 (English)

category id

Other (16)

index version

2025110801

paywall score

0

spam phrases

0

Text statistics

text nonlatin

0

text cyrillic

0

text characters

4039

text words

846

text unique words

408

text lines

1

text sentences

43

text paragraphs

1

text words per sentence

19

text matched phrases

0

text matched dictionaries

0