-
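Latent Dirichlet Allocation (LDA) topic model
LDA is a generative topic model for documents, also described as a three-level Bayesian probabilistic model with three layers: words, topics, and documents. Each document's distribution over topics is drawn from a Dirichlet prior, and each topic is a multinomial distribution over words.
LDA is an unsupervised machine learning technique for identifying latent topics in a large document collection or corpus. It uses the bag-of-words approach, which treats each document as a word-frequency vector and so turns text into numerical data that is easy to model. Bag of words ignores the order of words, which simplifies the problem and also leaves room for improving the model. Each document is represented as a probability distribution over topics, and each topic as a probability distribution over words.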
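As a small illustration of the bag-of-words representation, the sketch below turns whitespace-separated documents into word-frequency vectors. The function name `bag_of_words` and the whitespace tokenization are assumptions made for this example; they are not part of the listed package.

```python
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of text documents.

    Tokenization is a plain whitespace split, an assumption for this sketch.
    """
    # Sorted vocabulary gives every word a fixed column index.
    vocab = sorted({word for doc in docs for word in doc.split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        # Word order is discarded; only per-word frequencies remain.
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors

vocab, vectors = bag_of_words(["apple banana apple", "banana cherry"])
```

Documents containing the same words in a different order map to the same vector, which is exactly the loss of ordering noted in the description.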
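For each document in the corpus, LDA defines the following generative process:
1. For the current word position in the document, draw a topic from the document's topic distribution;
2. Draw a word from the word distribution of the topic just drawn;
3. Repeat the steps above until every word in the document has been generated.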
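The generative process above can be sketched with NumPy as follows. This is a minimal illustration rather than the code in the listed package. The function name `generate_document`, the Dirichlet parameter `alpha`, and the `topic_word` matrix are assumptions introduced here. Drawing the document's topic mixture from a Dirichlet prior follows the standard LDA formulation.

```python
import numpy as np

def generate_document(n_words, alpha, topic_word, rng):
    """Sample one document from the LDA generative process (sketch).

    alpha:      shape (K,), Dirichlet parameter for the topic mixture (assumed).
    topic_word: shape (K, V), row k is topic k's distribution over V words.
    """
    n_topics, vocab_size = topic_word.shape
    # Per-document topic mixture, drawn once from the Dirichlet prior.
    theta = rng.dirichlet(alpha)
    topics = np.empty(n_words, dtype=int)
    words = np.empty(n_words, dtype=int)
    for i in range(n_words):
        # Step 1: draw a topic for this word position.
        topics[i] = rng.choice(n_topics, p=theta)
        # Step 2: draw a word id from that topic's word distribution.
        words[i] = rng.choice(vocab_size, p=topic_word[topics[i]])
    # Step 3 is the loop itself: repeat for every word in the document.
    return theta, topics, words

rng = np.random.default_rng(0)
topic_word = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.2, 0.7]])
theta, topics, words = generate_document(10, np.array([0.5, 0.5]), topic_word, rng)
```

Fitting LDA to real data runs this process in reverse, inferring the topic mixtures and topic-word distributions from observed words. That inference step is not shown here.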
- Downloaded: 2022-03-16 01:27:42
- Points: 1
-
Scrapes images from the Baidu "美女" (Beauty) Tieba forum and saves them locally, with support for resuming interrupted downloads
- Downloaded: 2022-01-31 14:37:23
- Points: 1
-
Example source code for hands-on Linux socket programming
- Downloaded: 2022-02-01 06:21:11
- Points: 1
-
Fast algorithm for memory reads and writes
- Downloaded: 2022-02-02 10:36:17
- Points: 1
-
Wubi (五笔) input method typing practice software for Unix
- Downloaded: 2022-02-05 17:31:14
- Points: 1
-
key_driver007_select.tar
Two key (button) drivers demonstrating advanced I/O: the application adds select support and the driver adds poll support, in order to demonstrate the select function.
- Downloaded: 2009-12-02 20:51:54
- Points: 1
-
Image Viewer 1.2 is a Gtk sample application for showing pictures
- Downloaded: 2022-06-16 05:02:27
- Points: 1
-
Multi-process example program in which I/O operations run in separate processes
- Downloaded: 2022-01-26 08:35:16
- Points: 1
-
The Packet Debugger (pdb) is a program that lets people work with packet streams as if they were working with a source code debugger. Users can list, inspect, modify, and retransmit any packet from captured files, as well as work with live packet capture.
- Downloaded: 2022-03-23 12:24:25
- Points: 1
-
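Simple calculator
Source code for a simple calculator written in C, built from linked-list data structures, an operand stack, and an operator stack. It supports only the operators "+", "-", "*", "/", and "%", and operands may be positive or negative.
- Downloaded: 2022-07-17 02:44:18
- Points: 1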