published on in Tech
tags: database postgres json

PostgreSQL 10 JSON 全文检索(译)

译文版权@归原文所有.

这是对 JSON 数据的又一个很酷的补充. 现在我们可以轻松地添加 json 值的全文搜索.

一个它如何运作的快速例子:

$ select id, jsonb_pretty(payload) from test;
id |                                                jsonb_pretty
----+-------------------------------------------------------------------------------------------------------------
  1 | {                                                                                                          +
    |     "glossary": {                                                                                          +
    |         "title": "example glossary",                                                                       +
    |         "GlossDiv": {                                                                                      +
    |             "title": "S",                                                                                  +
    |             "GlossList": {                                                                                 +
    |                 "GlossEntry": {                                                                            +
    |                     "ID": "SGML",                                                                          +
    |                     "Abbrev": "ISO 8879:1986",                                                             +
    |                     "SortAs": "SGML",                                                                      +
    |                     "Acronym": "SGML",                                                                     +
    |                     "GlossDef": {                                                                          +
    |                         "para": "A meta-markup language, used to create markup languages such as DocBook.",+
    |                         "GlossSeeAlso": [                                                                  +
    |                             "GML",                                                                         +
    |                             "XML"                                                                          +
    |                         ]                                                                                  +
    |                     },                                                                                     +
    |                     "GlossSee": "markup",                                                                  +
    |                     "GlossTerm": "Standard Generalized Markup Language"                                    +
    |                 }                                                                                          +
    |             }                                                                                              +
    |         }                                                                                                  +
    |     }                                                                                                      +
    | }
(1 row)

正如你所看到的, 我有一个相当嵌套的 json 结构. 现在我们需要 tsvector 数据来构建一个索引. 我们可以这样:

$ select to_tsvector('english', payload) from test;
                                  to_tsvector
--------------------------------------------------------------------------------
 '1986':8 '8879':7 'creat':21 'docbook':26 'exampl':1 'general':35 'glossari':2.
. 'gml':28 'iso':6 'languag':18,23,37 'markup':17,22,32,36 'meta':16 'meta-mark.
.up':15 'sgml':4,10,12 'standard':34 'use':19 'xml':30
(1 row)

很好. 它找到了 json 数据值中的所有单词(它没有索引对象的键).

还有与之匹配的 ts_headline 函数.

$ select jsonb_pretty(ts_headline(payload, 'sgml'::tsquery)) from test;
                                                jsonb_pretty
-------------------------------------------------------------------------------------------------------------
 {                                                                                                          +
     "glossary": {                                                                                          +
         "title": "example glossary",                                                                       +
         "GlossDiv": {                                                                                      +
             "title": "S",                                                                                  +
             "GlossList": {                                                                                 +
                 "GlossEntry": {                                                                            +
                     "ID": "<b>SGML</b>",                                                                   +
                     "Abbrev": "ISO 8879:1986",                                                             +
                     "SortAs": "<b>SGML</b>",                                                               +
                     "Acronym": "<b>SGML</b>",                                                              +
                     "GlossDef": {                                                                          +
                         "para": "A meta-markup language, used to create markup languages such as DocBook.",+
                         "GlossSeeAlso": [                                                                  +
                             "GML",                                                                         +
                             "XML"                                                                          +
                         ]                                                                                  +
                     },                                                                                     +
                     "GlossSee": "markup",                                                                  +
                     "GlossTerm": "Standard Generalized Markup Language"                                    +
                 }                                                                                          +
             }                                                                                              +
         }                                                                                                  +
     }                                                                                                      +
 }
(1 row)

请注意上面SGML 字符串两边的 <b>.

讲真 - 我认为 JSON 被应用程序开发者滥用了很多, 但是这绝对是一个很好的补充, 非常感谢 Dmitry 和 Andrew.