Troubleshooting a Logstash-to-Elasticsearch write failure: "cannot be changed from type [long] to [text]"

South African camels say big data 2022-08-06 20:02:04


1. Preface

A user's Logstash pipeline, consuming Kafka data and writing it to ES, reported an error that caused the writes to fail. The cause needed to be investigated.

The error message reported by Logstash is as follows:

Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"XXXXXX-2022.08", :routing=>nil, :_type=>"_doc"}, #<LogStash::Event:0x153461e0>], :response=>{"index"=>{"_index"=>"XXXXXX-2022.08", "_type"=>"_doc", "_id"=>"AKkgXoIBJ6THvkxd4efv", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"mapper [modules.subModules.items.value] cannot be changed from type [long] to [text]"}}}}

In short, the field "modules.subModules.items.value" is mapped as type long, but the incoming write contains a string (text) value for it. This type-conversion conflict caused the data write to fail.

2. Data tests

Test 1:


PUT nginx-hezhenserver/_doc/1
{
  "modules": [
    {
      "name": "IPC",
      "subModules": [
        {
          "name": "cpuUse",
          "items": [
            {
              "name": "all",
              "value": "31.63"
            },
            {
              "name": "cpu1",
              "value": 29 // note: an unquoted number, a different type from the quoted value above
            }
          ]
        }
      ]
    }
  ]
}

The error is as follows:

[Screenshot: error inserting the document]

Test 2:

PUT nginx-hezhenserver/_doc/1
{
  "modules": [
    {
      "name": "IPC",
      "subModules": [
        {
          "name": "cpuUse",
          "items": [
            {
              "name": "all",
              "value": "31.63"
            },
            {
              "name": "cpu1",
              "value": "29" // note: quoted this time, so the value format matches the entry above
            }
          ]
        }
      ]
    }
  ]
}

Data created successfully.

From the comparison above, we can see that when the data types inside the sub-fields of an object-type field are inconsistent, ES reports a type-conversion error at write time; by default such a field would be stored as a string type.

3. Solution

Because the contents of this field in the user's data are all numeric, it is defined uniformly as long. A dynamic template can be added to the index template so that such fields are automatically converted to long when written to ES. Since this is a sub-field of an object field, path.match is needed. The setting in the template is as follows:

{
  "message22": {
    "mapping": {
      "type": "long"
    },
    "path_match": "modules.subModules.items.value"
  }
}
// map the value object sub-field under indices of this kind to long
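The snippet above is only the dynamic-template entry itself; to take effect it has to sit inside the dynamic_templates array of an index template. A minimal sketch, assuming the legacy _template API with typeless (7.x-style) mappings; the template name and index pattern here are illustrative, not from the original article:

PUT _template/nginx-hezhenserver-template
{
  "index_patterns": ["nginx-hezhenserver*"],
  "mappings": {
    "dynamic_templates": [
      {
        "message22": {
          "mapping": {
            "type": "long"
          },
          "path_match": "modules.subModules.items.value"
        }
      }
    ]
  }
}

On newer clusters the composable PUT _index_template API could be used instead, with the mappings nested under a "template" object.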

After applying the template above, create a new index.

Run the same test statement as before:

PUT nginx-hezhenserver/_doc/1
{
  "modules": [
    {
      "name": "IPC",
      "subModules": [
        {
          "name": "cpuUse",
          "items": [
            {
              "name": "all",
              "value": "31.63"
            },
            {
              "name": "cpu1",
              "value": 29
            }
          ]
        }
      ]
    }
  ]
}

Executing it again now succeeds. Checking the mapping, the value sub-field has become long, while the other object sub-fields are text.
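The mapping check can be done with a GET request; a minimal sketch against the test index used above:

GET nginx-hezhenserver/_mapping

In the response, modules.subModules.items.value should show "type": "long", while the name sub-fields are dynamically mapped as text (with a keyword sub-field) by ES defaults.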

So the problem is basically solved at this point, but because the source data Logstash consumes is not uniform — the contents of the value field are irregular — writes to ES still fail:

PUT nginx-hezhenserver/_doc/1
{
  "modules": [
    {
      "name": "IPC",
      "subModules": [
        {
          "name": "cpuUse",
          "items": [
            {
              "name": "all",
              "value": "31.63"
            },
            {
              "name": "cpu1",
              "value": 29
            },
            {
              "name": "cpu1",
              "value": "689MZH" // a string-typed value
            }
          ]
        }
      ]
    }
  ]
}

The error is as follows:

[Screenshot: write error 2]

This means that ES recognized the content written into value as a string that cannot be converted to long, so parsing failed.

Because the user side cannot enforce normalization of the written data, the only option is to readjust the template on the ES side and simply change the type to text. The adjustment is as follows:

{
  "message22": {
    "mapping": {
      "type": "text"
    },
    "path_match": "modules.subModules.items.value"
  }
}
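Note that a changed dynamic template does not alter the mapping of an index that already exists — an existing field's type cannot be changed in place. A sketch of applying the adjusted template and starting over with a fresh index, reusing the same illustrative template name and pattern (not from the original article):

PUT _template/nginx-hezhenserver-template
{
  "index_patterns": ["nginx-hezhenserver*"],
  "mappings": {
    "dynamic_templates": [
      {
        "message22": {
          "mapping": {
            "type": "text"
          },
          "path_match": "modules.subModules.items.value"
        }
      }
    ]
  }
}

DELETE nginx-hezhenserver

After the index is recreated, documents with numeric, quoted-numeric, or arbitrary string values can all be indexed, since JSON numbers are simply stored into the text field.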

Executing the statements above against the index again, the index is created successfully.

With this adjustment, the customer's problem of abnormal data failing to write while Logstash consumes Kafka is completely solved.

4. Summary

This case mainly relied on Elasticsearch dynamic templates, combined with path_match for object fields. The issue is documented here for reference.

Copyright: author [South African camels say big data]. Please include the original link when reprinting. https://en.javamana.com/2022/218/202208061954002361.html