
raw

Published 2021-01-06

Description:

Ten Chinese word segmentation datasets for training Chinese word segmentation models.

File list:

raw, 0 , 2019-02-10
raw\other, 0 , 2019-02-10
raw\other\zx, 0 , 2019-02-10
raw\other\zx\test.zhuxian.wordpos, 280885 , 2019-02-10
raw\other\zx\train.zhuxian.wordpos, 559793 , 2019-02-10
raw\other\zx\dev.zhuxian.wordpos, 166113 , 2019-02-10
raw\other\cnc, 0 , 2019-02-10
raw\other\cnc\dev.txt, 5581923 , 2019-02-10
raw\other\cnc\train.txt, 44824963 , 2019-02-10
raw\other\cnc\test.txt, 5571735 , 2019-02-10
raw\other\udc, 0 , 2019-02-10
raw\other\udc\dev.conll, 422116 , 2019-02-10
raw\other\udc\test.conll, 400684 , 2019-02-10
raw\other\udc\train.conll, 3282103 , 2019-02-10
raw\other\wtb, 0 , 2019-02-10
raw\other\wtb\dev.conll, 49336 , 2019-02-10
raw\other\wtb\test.conll, 49702 , 2019-02-10
raw\other\wtb\train.conll, 393054 , 2019-02-10
raw\other\sxu, 0 , 2019-02-10
raw\other\sxu\train.txt, 3600697 , 2019-02-10
raw\other\sxu\test.txt, 776035 , 2019-02-10
raw\other\ctb, 0 , 2019-02-10
raw\other\ctb\ctb6.dev.seg, 300375 , 2019-02-10
raw\other\ctb\ctb6.train.seg, 4030528 , 2019-02-10
raw\other\ctb\ctb6.test.seg, 312025 , 2019-02-10
raw\sighan2005, 0 , 2019-02-10
raw\sighan2005\cityu_test_gold.utf8, 239427 , 2019-02-10
raw\sighan2005\msr_training.utf8, 16804586 , 2019-02-10
raw\sighan2005\cityu_training.utf8, 8499903 , 2019-02-10
raw\sighan2005\as_test_gold.utf8, 711891 , 2019-02-10
raw\sighan2005\pku_test_gold.utf8, 716386 , 2019-02-10
raw\sighan2005\as_training.utf8, 30558193 , 2019-02-10
raw\sighan2005\msr_test_gold.utf8, 762801 , 2019-02-10
raw\sighan2005\pku_training.utf8, 7709182 , 2019-02-10
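
As one way to prepare the SIGHAN 2005 files listed above for training, the sketch below converts segmented lines into per-character BMES tags. It assumes the `*_training.utf8` files hold one sentence per line with words separated by whitespace, which is the usual SIGHAN 2005 bakeoff layout. The function names `words_to_bmes` and `read_bmes` are illustrative, and the `.conll`, `.wordpos`, and `.seg` files in the other directories may use different layouts that this sketch does not handle.

```python
from typing import Iterator, List, Tuple


def words_to_bmes(words: List[str]) -> List[Tuple[str, str]]:
    """Map segmented words to (character, tag) pairs using the BMES scheme.

    B = first character of a multi-character word, M = middle,
    E = last character, S = single-character word.
    """
    pairs: List[Tuple[str, str]] = []
    for word in words:
        if len(word) == 1:
            pairs.append((word, "S"))
        else:
            pairs.append((word[0], "B"))
            pairs.extend((ch, "M") for ch in word[1:-1])
            pairs.append((word[-1], "E"))
    return pairs


def read_bmes(path: str) -> Iterator[List[Tuple[str, str]]]:
    """Yield one tagged sentence per non-empty line of a segmented file.

    Assumption: the file is UTF-8 and words are whitespace-delimited.
    str.split() with no arguments also splits on full-width spaces.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            words = line.split()
            if words:
                yield words_to_bmes(words)
```

For example, `words_to_bmes("中文 分词 数据集".split())` tags 中/B 文/E 分/B 词/E 数/B 据/M 集/E, and `read_bmes` can be pointed at a file such as `raw\sighan2005\pku_training.utf8` once the archive is unpacked.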

