Created attachment 182388 [details]
new port shar file

SCWS (Simple Chinese Word Segmentation) is a frequency-dictionary-based Chinese word segmentation engine that cuts a passage of Chinese text into words. The word is the smallest independent unit of meaning in Chinese, but Chinese text does not separate words with spaces, so word segmentation is an important step in Chinese language processing. SCWS is written in C with no external dependencies and accepts GBK and UTF-8 encodings for both Simplified Chinese (zh_CN) and Traditional Chinese (such as zh_TW).

WWW: http://www.xunsearch.com/scws/index.php
Comment on attachment 182388 [details]
new port shar file

># This is a shell archive. Save it in a file, remove anything before
># this line, and then unpack it by entering "sh file". Note, it may
># create directories; files and directories will be owned by you and
># have default permissions.
>#
># This archive contains:
>#
># scws
># scws/distinfo
># scws/tmp
># scws/Makefile
># scws/pkg-descr
># scws/pkg-plist
>#
>echo c - scws
>mkdir -p scws > /dev/null 2>&1
>echo x - scws/distinfo
>sed 's/^X//' >scws/distinfo << '9894864824fc6e3b607eae66f59e52c9'
>XTIMESTAMP = 1494223276
>XSHA256 (scws-1.2.3.tar.bz2) = 60d50ac3dc42cff3c0b16cb1cfee47d8cb8c8baa142a58bc62854477b81f1af5
>XSIZE (scws-1.2.3.tar.bz2) = 485903
>9894864824fc6e3b607eae66f59e52c9
>echo x - scws/tmp
>sed 's/^X//' >scws/tmp << 'c21a431cd890495b6974063b9bc51dc2'
>X/you/have/to/check/what/makeplist/gives/you
>Xbin/scws
>Xbin/scws-gen-dict
>X%%ETCDIR%%/rules.ini.sample
>X%%ETCDIR%%/rules.utf8.ini.sample
>X%%ETCDIR%%/rules_cht.utf8.ini.sample
>Xinclude/scws/charset.h
>Xinclude/scws/crc32.h
>Xinclude/scws/darray.h
>Xinclude/scws/pool.h
>Xinclude/scws/rule.h
>Xinclude/scws/scws.h
>Xinclude/scws/version.h
>Xinclude/scws/xdb.h
>Xinclude/scws/xdict.h
>Xinclude/scws/xtree.h
>Xlib/libscws.la
>Xlib/libscws.so
>Xlib/libscws.so.1
>Xlib/libscws.so.1.1.0
>c21a431cd890495b6974063b9bc51dc2
>echo x - scws/Makefile
>sed 's/^X//' >scws/Makefile << '1605fec3e0a421cd44a8dc7da17f49ed'
>X# Created by: Jov <amutu@amutu.com>
>X# $FreeBSD$
>X
>XPORTNAME= scws
>XPORTVERSION= 1.2.3
>XCATEGORIES= textproc
>XMASTER_SITES= http://www.xunsearch.com/scws/down/
>X
>XMAINTAINER= amutu@amutu.com
>XCOMMENT= Simple Chinese word segmentation program and lib
>X
>XLICENSE= BSD2CLAUSE
>X
>XGNU_CONFIGURE= yes
>XUSES= gmake libtool:keepla tar:bzip2
>XUSE_LDCONFIG= yes
>X
>XCONFIGURE_ARGS= --sysconfdir=${PREFIX}/etc/scws \
>X		--with-pic
>X
>XINSTALL_TARGET=install-strip
>X
>Xpost-install:
>X	${MV} ${STAGEDIR}${PREFIX}/etc/scws/rules.ini \
>X		${STAGEDIR}${PREFIX}/etc/scws/rules.ini.sample
>X	${MV} ${STAGEDIR}${PREFIX}/etc/scws/rules.utf8.ini \
>X		${STAGEDIR}${PREFIX}/etc/scws/rules.utf8.ini.sample
>X	${MV} ${STAGEDIR}${PREFIX}/etc/scws/rules_cht.utf8.ini \
>X		${STAGEDIR}${PREFIX}/etc/scws/rules_cht.utf8.ini.sample
>X
>X.include <bsd.port.mk>
>1605fec3e0a421cd44a8dc7da17f49ed
>echo x - scws/pkg-descr
>sed 's/^X//' >scws/pkg-descr << '31f5a13c77d8f428f238ab0d8084dfb9'
>XSCWS (Simple Chinese Word Segmentation) is a frequency dictionary based Chinese
>Xword segmentation engine, it can cut a whole section of the Chinese text into
>Xwords. Word is the smallest unit of morpheme in Chinese, but in Chinese words
>Xare not separated by spaces,so word segmentation is an important step for
>XChinese language process.SCWS is written in C without other dependencies and
>Xaccept GBK and UTF-8 encoding for both the Simple Chinese (zh_CN) and the
>XTraditional Chinese (such as zh_TW).
>X
>XWWW: http://www.xunsearch.com/scws/index.php
>31f5a13c77d8f428f238ab0d8084dfb9
>echo x - scws/pkg-plist
>sed 's/^X//' >scws/pkg-plist << '8e14e730a1dd29627c91dc2cf0df2327'
>Xbin/scws
>Xbin/scws-gen-dict
>X%%ETCDIR%%/rules.ini.sample
>X%%ETCDIR%%/rules.utf8.ini.sample
>X%%ETCDIR%%/rules_cht.utf8.ini.sample
>Xinclude/scws/charset.h
>Xinclude/scws/crc32.h
>Xinclude/scws/darray.h
>Xinclude/scws/pool.h
>Xinclude/scws/rule.h
>Xinclude/scws/scws.h
>Xinclude/scws/version.h
>Xinclude/scws/xdb.h
>Xinclude/scws/xdict.h
>Xinclude/scws/xtree.h
>Xlib/libscws.la
>Xlib/libscws.so
>Xlib/libscws.so.1
>Xlib/libscws.so.1.1.0
>8e14e730a1dd29627c91dc2cf0df2327
>exit
>
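For readers unfamiliar with shell archives, the unpacking mechanism used by the attachment can be sketched roughly as follows. This is only an illustration: the archive name `demo.shar`, the `demo` directory, the here-document delimiters, and the placeholder description line are all hypothetical and are not part of the submitted port.

```shell
#!/bin/sh
# Minimal sketch of how a shar file unpacks itself.
# demo.shar, demo/ and the payload text are hypothetical examples.

workdir=$(mktemp -d)

# Build a tiny archive. Each payload line carries an X prefix so that
# blank lines and leading whitespace survive mail and copy/paste; the
# quoted here-document delimiter keeps the payload from being expanded.
cat > "$workdir/demo.shar" << 'EOF_SHAR'
echo c - demo
mkdir -p demo > /dev/null 2>&1
echo x - demo/pkg-descr
sed 's/^X//' >demo/pkg-descr << 'END_OF_FILE'
XHypothetical description line.
X
XWWW: http://www.xunsearch.com/scws/index.php
END_OF_FILE
exit
EOF_SHAR

# Unpack it the way the archive header instructs: run it with sh.
# The subshell keeps the caller's working directory unchanged.
( cd "$workdir" && sh demo.shar )

# Show the extracted file; sed has stripped the X prefixes.
cat "$workdir/demo/pkg-descr"
```

Because the archive is an ordinary shell script, it is worth reading it before running it, which is one reason the tracker quotes the attachment inline.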
Hi Martin, I am working on another new port that depends on this one. What can I do to speed up the acceptance process for this port?
ping
Assignee timeout. Give back to the pool.
1) I suspect the file "tmp" could be removed?
2) I see some *.sample files in %%ETCDIR%%. Are they usable as they are? In that case we should mark them with @sample. We should always aim to make the port as easy to use as possible: if there is a reasonable default config, we should just use it.

How can I test the port? My kanji understanding is far too poor to follow the instructions.
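As background for the @sample suggestion, the keyword's behaviour can be approximated in plain shell. This is a rough sketch only: the real logic lives in the ports tree's keyword definition, and the helper names `install_sample` and `remove_sample`, the temporary directory, and the file contents below are hypothetical.

```shell
#!/bin/sh
# Rough sketch of what the @sample plist keyword does for a config file.
# install_sample / remove_sample are hypothetical helper names.

# On install: copy foo.sample to foo only if foo does not exist yet,
# so an administrator's edited copy is never overwritten.
install_sample() {
    sample=$1
    target=${sample%.sample}
    [ -e "$target" ] || cp -p "$sample" "$target"
}

# On deinstall: remove foo only if it is still identical to foo.sample.
remove_sample() {
    sample=$1
    target=${sample%.sample}
    if [ -e "$target" ] && cmp -s "$sample" "$target"; then
        rm -f "$target"
    fi
}

etcdir=$(mktemp -d)                       # stands in for %%ETCDIR%%
printf 'default rules\n' > "$etcdir/rules.utf8.ini.sample"

install_sample "$etcdir/rules.utf8.ini.sample"       # first install: creates rules.utf8.ini
printf 'local change\n' >> "$etcdir/rules.utf8.ini"  # administrator edits the live file
install_sample "$etcdir/rules.utf8.ini.sample"       # upgrade: the edit is kept
remove_sample "$etcdir/rules.utf8.ini.sample"        # deinstall: the edited file stays

ls "$etcdir"
```

In a pkg-plist, the same intent is expressed by prefixing the sample entry with the keyword, for example `@sample %%ETCDIR%%/rules.utf8.ini.sample`.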
Created attachment 184341 [details]
scws.shar

Do not change the rule file names.
Created attachment 184342 [details]
test.sh

To test this PR, set your environment (assuming csh):

setenv LANG zh_CN.UTF-8

Running ./test.sh will show:

env ok
test the lib:
test ok!
scws-dict-chs-utf8.tar.bz2 100% of 3994 kB 1054 kBps 00m03s
x dict.utf8.xdb
test the scws cmd:
FreeBSD/en 是/v 一个/m 伟大/a 的/uj 操作系统/l
+--[scws(scws-cli/1.2.3)]----------+
| TextLen: 37 |
| Prepare: 0.0012 (sec) |
| Segment: 0.0002 (sec) |
+--------------------------------+
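The contents of test.sh are not shown in this thread. As an illustration of the kind of check that could sit behind the "env ok" line, here is a minimal sketch under that assumption; the helper names `effective_ctype` and `is_utf8_locale` are hypothetical and are not taken from the attachment.

```shell
#!/bin/sh
# Hypothetical sketch of a locale check similar in spirit to the
# "env ok" step; this is not the actual content of test.sh.

# Report the locale that governs character handling:
# LC_ALL overrides LC_CTYPE, which overrides LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    else
        printf '%s\n' "${LANG:-C}"
    fi
}

# Succeed when the locale name carries a UTF-8 codeset suffix.
is_utf8_locale() {
    case $1 in
        *.UTF-8|*.utf-8|*.UTF8|*.utf8) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_locale "$(effective_ctype)"; then
    echo "env ok"
else
    echo "env is not UTF-8: $(effective_ctype)" >&2
fi
```

Checking the variables in this order matters because a stray LC_ALL in the environment would otherwise silently override the LANG value set with setenv.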
Thanks! Committed! :)
A commit references this bug:

Author: tz
Date: Mon Jul 17 10:16:05 UTC 2017
New revision: 446058
URL: https://svnweb.freebsd.org/changeset/ports/446058

Log:
  New port: textproc/scws

  SCWS (Simple Chinese Word Segmentation) is a frequency dictionary based Chinese
  word segmentation engine, it can cut a whole section of the Chinese text into
  words. Word is the smallest unit of morpheme in Chinese, but in Chinese words
  are not separated by spaces,so word segmentation is an important step for
  Chinese language process.SCWS is written in C without other dependencies and
  accept GBK and UTF-8 encoding for both the Simple Chinese (zh_CN) and the
  Traditional Chinese (such as zh_TW).

  WWW: http://www.xunsearch.com/scws/index.php

  PR: 219132
  Submitted by: Jov <amutu@amutu.com>

Changes:
  head/textproc/Makefile
  head/textproc/scws/
  head/textproc/scws/Makefile
  head/textproc/scws/distinfo
  head/textproc/scws/pkg-descr
  head/textproc/scws/pkg-plist