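If you have already imported the test data into Elasticsearch and installed the head plugin, you will see that Elasticsearch now holds a fair amount of data.

As the head plugin shows, collections has 86772 records, products has 538790 records, and users has 970446 records.

With the data in place, how do we wrap a query API around it?

First, let's look at an arbitrary record from products: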

{
  "_index": "products",
  "_type": "product",
  "_id": "5246bc3dee910bfe70000063",
  "_version": 1,
  "_score": 1,
  "_source": {
    "category": "womens_fashion",
    "image_url": "https://s3.amazonaws.com/savvy_products/5246bc3dee910bfe70000063_1380367421",
    "original": 1,
    "description": "Soie Women's Bra",
    "likers": [],
    "country": "in",
    "price": "Rs. 440",
    "hashtags": [],
    "image_attrs": {
      "width": 0,
      "height": 0
    },
    "updated_at": "2013-10-24T07:11:25.143Z",
    "source": "http://www.flipkart.com/soie-women-s-bra/p/itmdhgfzu2uctvug?pid=BRADHGFZVGUEYGGF&ref=3496d1d6-47e9-4b44-a586-f66bc2243fd3",
    "featured": 1,
    "liker_ids": [],
    "comments_count": 0,
    "likes": 0,
    "urls": [],
    "image_s3_id": "5246bc3dee910bfe70000063_1380367421",
    "collection_id": "5246bc3dee910bfe70000062",
    "mentions": [],
    "created_at": "2013-09-28T11:23:41.944Z",
    "user": {
      "fb_id": "",
      "name": "Anonymous Aficionado",
      "id": "52432fca1fb2e64813000553"
    }
  }
}

There are quite a few fields in there. Let's start with a simple one: products has a field called description:

"description": "Soie Women's Bra",

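We will build an API around this field that runs a fuzzy (partial-match) search on description.

To do that, we first need to change the settings of the products index and give it an analyzer.

The index was already created with default settings when the data was first imported, so those settings have to be modified.

First, close the products index: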

curl -XPOST http://localhost:9200/products/_close  

Then add the analyzer to products:

curl -XPUT http://localhost:9200/products/_settings -d '  
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  }
}'

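With that, products has a custom analyzer named autocomplete: the standard tokenizer, followed by the lowercase filter and an edge_ngram filter that emits the prefixes of each token, from 1 to 20 characters long.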
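To get a feel for what this analyzer produces, here is a small pure-Python sketch that imitates it. It is only an approximation: the regular expression stands in for the standard tokenizer, and the function names `edge_ngrams` and `autocomplete_analyze` are made up for this illustration, not part of Elasticsearch.

```python
import re


def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of `token` whose length is in min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def autocomplete_analyze(text, min_gram=1, max_gram=20):
    """Imitate the `autocomplete` analyzer: tokenize, lowercase, edge n-grams."""
    # Rough stand-in for the standard tokenizer (an assumption): keep runs of
    # letters, digits, underscores and apostrophes together as one token.
    tokens = re.findall(r"[\w']+", text)
    terms = []
    for token in tokens:
        terms.extend(edge_ngrams(token.lower(), min_gram, max_gram))
    return terms


if __name__ == "__main__":
    # Each word of the sample description is expanded into its prefixes.
    print(autocomplete_analyze("Soie Women's Bra"))
```

Every prefix ends up in the index as its own term, which is what lets a partially typed word match later on.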

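Analysis settings can only be changed while the index is closed, so reopen it now by sending a POST request to http://localhost:9200/products/_open. Then update the mapping of description so that it uses the right analyzer at index time and at search time.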

curl -XPUT http://localhost:9200/products/_mapping/product -d '
{
  "product": {
    "properties": {
      "description": {
        "type": "string",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      }
    }
  }
}'

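Here the description field is analyzed with the autocomplete analyzer at index time and with the standard analyzer at search time. Be aware that Elasticsearch may refuse to change the analyzer of a field that already exists, and that documents indexed earlier are not re-analyzed, so you may have to re-import the data once the mapping is in place.

OK, with this configuration, description is split into prefix terms automatically whenever a document is indexed.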
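Why use a different analyzer at search time? The sketch below imitates both sides in plain Python. It is a simplification under stated assumptions: the regex tokenizer, the helper names, and the description "White Cotton Shirt" are invented for this example, and matching is reduced to "any query term appears among the indexed terms".

```python
import re


def tokenize(text):
    """Rough stand-in for standard tokenizer + lowercase (an assumption)."""
    return [t.lower() for t in re.findall(r"[\w']+", text)]


def index_terms(text, min_gram=1, max_gram=20):
    """Terms stored at index time: every prefix of every token."""
    terms = set()
    for token in tokenize(text):
        for n in range(min_gram, min(len(token), max_gram) + 1):
            terms.add(token[:n])
    return terms


def search_terms(text):
    """Terms produced at search time: whole lowercased tokens, no n-grams."""
    return set(tokenize(text))


def matches(query, description):
    """True if any search term is among the indexed terms of the description."""
    return bool(search_terms(query) & index_terms(description))


if __name__ == "__main__":
    print(matches("Whi", "White Cotton Shirt"))   # "whi" is a stored prefix
    print(matches("Whi", "Soie Women's Bra"))     # no token starts with "whi"
    # If the query were n-grammed too, "w" alone would already match "women's":
    print(index_terms("Whi") & index_terms("Soie Women's Bra"))
```

If the query were also run through the edge n-gram filter, typing "Whi" would produce the term "w" and hit every description containing a word that starts with "w". Keeping the search side on the standard analyzer avoids that.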

Next, install Flask:

easy_install flask  
easy_install flask_restful  
easy_install flask_cors  

Write a simple Flask program, run.py:

from flask import Flask
from flask_restful import reqparse, Resource, Api
from flask_cors import CORS
import requests
import json

app = Flask(__name__)
CORS(app)  # allow cross-origin requests, e.g. from a browser front end
api = Api(app)

api_base_url = '/api/v1'
# Elasticsearch endpoint of the products index; adjust the host to your setup.
es_base_url = {
    'products': 'http://172.16.11.2:9200/products/product',
}

# Register the query-string argument once, not on every request.
parser = reqparse.RequestParser()
parser.add_argument('q', required=True, help='text to search for in description')

class Search(Resource):

    def get(self):
        print("Call for GET /products/product/_search")
        query_string = parser.parse_args()
        url = es_base_url['products']+'/_search'
        query = {
            "query": {
                "multi_match": {
                    "fields": ["description"],
                    "query": query_string['q'],
                    "type": "cross_fields",
                    "use_dis_max": False
                }
            },
            "size": 10000
        }
        resp = requests.post(url, data=json.dumps(query))
        data = resp.json()
        # Flatten the hits: return each _source with its _id attached as id.
        products = []
        for hit in data['hits']['hits']:
            product = hit['_source']
            product['id'] = hit['_id']
            products.append(product)
        return products

api.add_resource(Search, api_base_url+'/search')

# Start the development server only when the file is run directly.
if __name__ == '__main__':
    app.run(host='0.0.0.0', debug=True)

Run the Flask program:

python run.py  

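Then open the URL (http://xxx.xxx.xxx.xxx:5000/api/v1/search?q=Whi) to try it out. The response should be a JSON array of the products whose description contains a word starting with "whi".

OK, that's all there is to it.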
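If you want to check the request and response handling without a running cluster, the two steps inside the handler can be pulled out into plain functions. This is only a sketch: `build_query` and `extract_products` are hypothetical helper names, the default `size=10` is an assumption (the program above asks for 10000), and the sample response is a trimmed copy of the record shown earlier.

```python
def build_query(text, size=10):
    """Build the search body used by the API for the given text."""
    return {
        "query": {
            "multi_match": {
                "fields": ["description"],
                "query": text,
                "type": "cross_fields",
                "use_dis_max": False,
            }
        },
        "size": size,  # assumption: a small default instead of 10000
    }


def extract_products(response):
    """Flatten a search response into a list of _source dicts with an id."""
    products = []
    for hit in response.get("hits", {}).get("hits", []):
        product = dict(hit["_source"])  # copy so the response is not mutated
        product["id"] = hit["_id"]
        products.append(product)
    return products


if __name__ == "__main__":
    # Trimmed-down response in the shape Elasticsearch returns.
    sample = {
        "hits": {
            "hits": [
                {
                    "_id": "5246bc3dee910bfe70000063",
                    "_source": {"description": "Soie Women's Bra"},
                }
            ]
        }
    }
    print(build_query("Whi"))
    print(extract_products(sample))
```

Keeping these two pieces free of HTTP calls makes them easy to test, and the Flask handler then only has to glue them to `requests.post`.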
