Installing coreseek/sphinx on CentOS Linux: the complete process

Posted by LEO on May 16, 2009

To prepare for moving the 亿枝客 servers over to Linux, I ran this test installation. Below is the complete process that worked.

Environment:

CentOS 5.2

xampp1.7.1 (Apache 2.2.11, MySQL 5.1.33, PHP 5.2.9)

coreseek3.1b3 / Sphinx 0.9.9-rc1

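1. Installing CentOS

I will skip the details of this step, with one caveat: during installation, be sure to select all of the development tools and development libraries in the server package groups. If you forgot to select them, run the following commands before starting the installation below: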

yum install gcc

yum install gcc-c++

yum install python

yum install python-devel

yum install gtk+

yum install libtool

yum install automake
yum install autoconf

yum install mysql-devel
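
The separate commands above can also be collapsed into a single yum call. The following is a minimal sketch: the package names are copied from the list above, while the PKGS variable and the install_build_deps function name are only illustrative. The -y flag answers yum's confirmation prompt automatically.

```shell
#!/bin/sh
# Build dependencies, taken from the list above.
PKGS="gcc gcc-c++ python python-devel gtk+ libtool automake autoconf mysql-devel"

# install_build_deps is a hypothetical helper name; run it as root.
install_build_deps() {
    # $PKGS is intentionally unquoted so each name becomes its own argument.
    yum install -y $PKGS
}

# Usage:
# install_build_deps
```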

2. Installing xampp

See http://www.apachefriends.org/zh_cn/xampp-linux.html#1677

3. Installing coreseek

cd /opt/software

Download mmseg

wget http://www.coreseek.com/uploads/sources/mmseg3_0b3.tar.gz

Download coreseek
wget http://www.coreseek.com/uploads/sources/csft3.1b3.tar.gz

Unpack both archives
tar -xzvf mmseg3_0b3.tar.gz
tar -xzvf csft3.1b3.tar.gz

Build mmseg
cd /opt/software/mmseg.3_0b3
./configure --prefix=/usr/local/mmseg
make
make install

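Note: if this step fails with the error css/UnigramCorpusReader.cpp:89: error: 'strncmp' was not declared in this scope
then edit UnigramCorpusReader.cpp in the src/css directory by hand and add the following as its first line:

#include <string.h>
Then run make clean, followed by make and make install again; the build should now pass.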

Once this step is finished, an mmseg directory will exist under /usr/local.
Edit /usr/local/mmseg/include/mmseg/freelist.h by hand:
vi   /usr/local/mmseg/include/mmseg/freelist.h
and add at the top:
#include <string.h>
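
Both manual edits above add the same include line, so they can be scripted. The following is a minimal sketch, assuming a POSIX shell; add_string_h is a hypothetical helper name, not something shipped with mmseg. It prepends the line only when the file does not already contain it, so it is safe to run twice.

```shell
#!/bin/sh
# Prepend "#include <string.h>" to a file unless the line is already there.
# add_string_h is a hypothetical helper name, not part of mmseg.
add_string_h() {
    file="$1"
    if grep -q '^#include <string\.h>' "$file"; then
        return 0    # already patched, nothing to do
    fi
    tmp="$file.tmp.$$"
    # Write the include first, then the original contents, then swap files.
    { printf '#include <string.h>\n'; cat "$file"; } > "$tmp" && mv "$tmp" "$file"
}

# Usage (paths as in the steps above):
# add_string_h /opt/software/mmseg.3_0b3/src/css/UnigramCorpusReader.cpp
# add_string_h /usr/local/mmseg/include/mmseg/freelist.h
```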

Build coreseek
cd  /opt/software/csft3.1b3/
Everything up to this point went smoothly; the steps below are where problems may appear.
First, run configure:

./configure --prefix=/usr/local/coreseek --with-python --with-mysql --with-mmseg-includes=/usr/local/mmseg/include/mmseg --with-mmseg-libs=/usr/local/mmseg/lib/

make

make install

Possible error 1:

pydatasource.cpp:742: error: invalid conversion from 'const char*' to 'char*'
pydatasource.cpp:742: error:   initializing argument 2 of 'PyObject* PyObject_GetAttrString(PyObject*, char*)'
make[2]: *** [pydatasource.o] Error 1
make[2]: Leaving directory `/opt/csft3.1b3/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/opt/csft3.1b3/src'
make: *** [all-recursive] Error 1

Fix: run yum install python-devel, or drop --with-python from the configure command.

Possible error 2:

sphinxutils.cpp:793: error: cannot convert 'int*' to 'Py_ssize_t*' for argument '2' to 'int PyDict_Next(PyObject*, Py_ssize_t*, PyObject**, PyObject**)'
sphinxutils.cpp:802: warning: unused variable 'nRet'
make[2]: *** [sphinxutils.o] Error 1
make[2]: Leaving directory `/home/syu/sphinx/csft3_0b4/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/syu/sphinx/csft3_0b4/src'
make: *** [all-recursive] Error 1

Fix: open sphinxutils.cpp in the src directory and, at around line 789, change int pos = 0; to Py_ssize_t pos = 0;

Then run make clean, followed by make && make install.
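
The same one-line change can be made with sed instead of an editor. The following is a minimal sketch, assuming a POSIX shell and sed; fix_pos_type is a hypothetical helper name, and the line range passed to it is a guess around "line 789" that should be checked first with grep -n 'int pos' src/sphinxutils.cpp. Restricting the substitution to a range avoids touching any other int pos declarations in the file.

```shell
#!/bin/sh
# Change "int pos = 0;" to "Py_ssize_t pos = 0;" within a given line range.
# fix_pos_type is a hypothetical helper; arguments: file, first line, last line.
fix_pos_type() {
    file="$1"; first="$2"; last="$3"
    tmp="$file.tmp.$$"
    # Only a declaration at the start of a line (after indentation) is matched.
    sed "${first},${last}s/^\([[:space:]]*\)int\([[:space:]]\{1,\}pos[[:space:]]*=[[:space:]]*0;\)/\1Py_ssize_t\2/" \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage (the range 780-800 is an assumption; verify it against your copy):
# fix_pos_type /opt/software/csft3.1b3/src/sphinxutils.cpp 780 800
```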

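Once this step is finished, a coreseek directory will exist under /usr/local/.

4. Rebuilding the MySQL in xampp with SphinxSE support

There are two ways to use coreseek/sphinx: through the API that sphinx provides (see its API reference), or through the MySQL storage engine (SphinxSE).

We use the second approach, which is also the one I prefer: it is simpler and requires few changes to the application.

Installing SphinxSE requires recompiling MySQL, but our environment is the xampp package, which makes the build somewhat awkward. See the article "Recompiling/Building the MySQL in Xampp".

5. Configuring sphinx

Create the directories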

mkdir /usr/local/coreseek/log

mkdir /usr/local/coreseek/pid

mkdir /usr/local/coreseek/data/dict
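
If /usr/local/coreseek/data does not exist yet, the last mkdir above fails because plain mkdir does not create parent directories. The following is a minimal sketch that uses mkdir -p instead; make_coreseek_dirs is a hypothetical helper name, and the default prefix is the one used in this article.

```shell
#!/bin/sh
# Create the log, pid and dictionary directories under the coreseek prefix.
# make_coreseek_dirs is a hypothetical helper name.
make_coreseek_dirs() {
    prefix="${1:-/usr/local/coreseek}"
    # -p creates missing parents and does not fail if a directory exists.
    mkdir -p "$prefix/log" "$prefix/pid" "$prefix/data/dict"
}

# Usage:
# make_coreseek_dirs
```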

Edit sphinx.conf

cp /usr/local/coreseek/etc/sphinx.conf.dist  /usr/local/coreseek/etc/sphinx.conf

vi /usr/local/coreseek/etc/sphinx.conf

Change the following settings

source yicike_search
{
type                                    = mysql
sql_host                                = localhost
sql_user                                =
sql_pass                                =
sql_db                                  = mainyicike
sql_port                                = 3306  # optional, default is 3306
sql_query_pre                   = set names utf8
sql_query_pre                   = SET SESSION query_cache_type=OFF
sql_query                               = \
select id,title,category_id  from pricecomparison_product
sql_attr_uint        = category_id
sql_ranged_throttle     = 0
}

index yicike_search
{
source                  = yicike_search
path                    = /usr/local/coreseek/data/yicike_search
docinfo                 = extern
mlock                   = 0
morphology              = none
stopwords               = /usr/local/coreseek/data/dict/stopwords.txt
min_word_len            = 1
charset_type            = zh_cn.utf-8
charset_dictpath        = /usr/local/coreseek/data/dict
min_prefix_len          = 0
min_infix_len           = 0
ngram_len               = 1
ngram_chars = U+4E00..U+9FBF, U+3400..U+4DBF, U+20000..U+2A6DF, U+F900..U+FAFF,\
U+2F800..U+2FA1F, U+2E80..U+2EFF, U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF,\
U+3040..U+309F, U+30A0..U+30FF, U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF,\
U+3130..U+318F, U+A000..U+A48F, U+A490..U+A4CF
html_strip              = 0

}

source yicike_search_ctitle:yicike_search
{
sql_query                               = \
select id,title,PAGE_KEYWORDS from pricecomparison_category
sql_ranged_throttle     = 0
}

index yicike_search_ctitle
{
source                  = yicike_search_ctitle
path                    = /usr/local/coreseek/data/yicike_search_ctitle
docinfo                 = extern
mlock                   = 0
morphology              = none
stopwords               = /usr/local/coreseek/data/dict/stopwords.txt
min_word_len            = 1
charset_type            = zh_cn.utf-8
charset_dictpath        = /usr/local/coreseek/data/dict
min_prefix_len  = 0
min_infix_len           = 0
ngram_len               = 1
ngram_chars = U+4E00..U+9FBF, U+3400..U+4DBF, U+20000..U+2A6DF, U+F900..U+FAFF,\
U+2F800..U+2FA1F, U+2E80..U+2EFF, U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF,\
U+3040..U+309F, U+30A0..U+30FF, U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF,\
U+3130..U+318F, U+A000..U+A48F, U+A490..U+A4CF
html_strip              = 0

}

indexer
{
mem_limit                       = 320M
# max_iops                      = 40
# max_iosize            = 1048576
}

searchd
{

listen                                = 3312
log =/usr/local/coreseek/log/searchd.log
query_log=/usr/local/coreseek/log/query.log
read_timeout=5
max_children=30
pid_file=/usr/local/coreseek/pid/searchd.pid
max_matches=1000000
seamless_rotate =1
preopen_indexes=0
unlink_old  =1
}
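
With the configuration saved, the indexes still have to be built and searchd started before anything can be queried. The following is a minimal sketch of that step, assuming the binaries were installed to bin/ under the prefix used above; build_and_start, run, and the DRY_RUN switch are illustrative names, not part of coreseek.

```shell
#!/bin/sh
# Paths derived from the --prefix used in this article (an assumption).
CORESEEK_PREFIX="${CORESEEK_PREFIX:-/usr/local/coreseek}"
CONF="$CORESEEK_PREFIX/etc/sphinx.conf"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Build every index defined in sphinx.conf, then start the search daemon.
build_and_start() {
    run "$CORESEEK_PREFIX/bin/indexer" --config "$CONF" --all &&
    run "$CORESEEK_PREFIX/bin/searchd" --config "$CONF"
}

# Usage:
# DRY_RUN=1 build_and_start    # show the commands only
# build_and_start              # actually run them
```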

Create the SphinxSE tables

CREATE TABLE IF NOT EXISTS `sphinx` (
`id` int(11) NOT NULL,
`weight` int(11) NOT NULL,
`query` varchar(255) NOT NULL,
`category_id` int(11) NOT NULL,
KEY `Query` (`query`)
) ENGINE=SPHINX DEFAULT CHARSET=utf8 CONNECTION='sphinx://localhost:3312/yicike_search';

CREATE TABLE IF NOT EXISTS `sphinxc` (
`id` int(11) NOT NULL,
`weight` int(11) NOT NULL,
`query` varchar(255) NOT NULL,
KEY `Query` (`query`)
) ENGINE=SPHINX DEFAULT CHARSET=utf8 CONNECTION='sphinx://localhost:3312/yicike_search_ctitle';
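
Once the tables exist and searchd is running, a search is an ordinary SELECT whose WHERE clause puts the search text and options into the query column. The following is a minimal sketch that prints such a statement, joining the hits back to the product table from the source definition above. sphinxse_sql is a hypothetical helper name, the mode and limit options are just one reasonable choice, and the keyword is not escaped, so it is only suitable for trusted input.

```shell
#!/bin/sh
# Print a SphinxSE query that joins search hits back to the product table.
# sphinxse_sql is a hypothetical helper; the keyword is NOT escaped.
sphinxse_sql() {
    keyword="$1"
    limit="${2:-20}"
    cat <<EOF
SELECT p.id, p.title, s.weight
FROM sphinx AS s
JOIN pricecomparison_product AS p ON p.id = s.id
WHERE s.query = '${keyword};mode=any;limit=${limit}';
EOF
}

# Usage (assumes a mysql client on PATH and a valid account):
# sphinxse_sql "phone" | mysql -u USER -p mainyicike
```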

References:

http://blog.csdn.net/syu/archive/2009/01/11/3754818.aspx

http://www.sphinxsearch.com/docs/current.html#sphinxse

http://www.coreseek.com/forum/index.php?action=vthread&forum=2&topic=165

http://blog.tom.com/benge_zhao/article/5052.html

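Author: 独思客
Originally published at: 亿枝客比较导购网
All rights reserved. Reproductions must credit the author, link to the original source, and include this notice.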
