mongodb aggregate 基于UNIX时间戳的聚合

alt

对今天解决不了的事情,也不要着急。因为明天也可能还是解决不了。

前提

开发找我问MongoDBaggregate 聚合用过不😰

我说查查资料吧,发现是aggregate聚合管道 😨

把SQL 与 Aggergation 对比下😂

SQL Terms, Functions, and Concepts MongoDB Aggregation Operators
WHERE $match
HAVING $match
SELECT $project
ORDER BY $sort
LIMIT $limit
SUM() $sum
COUNT() $sum
COUNT() $sortByCount
join $lookup

开发的需求是:

  • 按照天分组,统计一下数量

经过查看MongoDB官网aggregate资料实现开发的需求

实现

查询数据的状态

1
db.antispam_image.find({"moduleId":5,createTs:{$gte:1554048000},createTs:{$lt:1556640000}}).pretty().limit(1);

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
{
"_id" : ObjectId("5c125657372eecd7a0a54e5e"),
"adRate" : 0,
"assoId" : "223794113",
"assoType" : 201,
"clientIp" : "183.131.7.23",
"clientType" : "",
"confidence" : 12.655,
"content" : "****http://videoplayer.babytreeimg.com/lamavideo:2018/1213/Flb_T4qsxRcK7suNqaHo2JbD6zWJ_000001.jpg?id=-1\**",
"createTs" : NumberLong(1544705623),
"emailStatus" : "verified",
"handleTs" : NumberLong(1545036013),
"hotScore" : 14.285,
"inspectStatus" : 2,
"message" : "",
"moduleId" : NumberLong(5),
"normalScore" : 73.25,
"ocrKeyword" : [ ],
"ocrText" : [ ],
"opUser" : "fanyanning",
"opUserId" : NumberLong(31648494),
"pornScore" : 12.464,
"qcode" : 0,
"receiveTs" : NumberLong(1545035885),
"regTs" : NumberLong(1475385061),
"requestId" : "6f078c58-1235-4816-96a5-e876ea0cbcf2",
"rotOcrKeyword" : [ ],
"rotOcrText" : [ ],
"ruleId" : NumberLong(10019),
"sim" : 0,
"status" : 1000,
"trashType" : 0,
"ts" : NumberLong(1544705623),
"url" : "http://videoplayer.babytreeimg.com/lamavideo:2018/1213/Flb_T4qsxRcK7suNqaHo2JbD6zWJ_000001.jpg?id=-1",
"userId" : NumberLong(56349617),
"userLevel" : 3,
"version" : "1.0.0"
}

开发给的聚合的查询

1
2
3
4
db.antispam_image.aggregate([
{ $match: {"moduleId":5,createTs:{$gte:1554048000},createTs:{$lt:1556640000}}},
{ $group: {_id :{$dateToString: {format: "%Y-%m-%d", date: "$createTs" }},count: { $sum: 1 }}}
]);

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
mgset-11469021:SECONDARY> db.antispam_image.aggregate([{$match: {"moduleId":5, createTs:{$gte:1554048000},createTs:{$lt:1556640000}}}, {$group: {_id: {$dateToString: {format: "%Y-%m-%d", date: "$createTs"}}, count: {$sum: 1}}}]);
assert: command failed: {
"operationTime" : Timestamp(1558347069, 2),
"ok" : 0,
"errmsg" : "can't convert from BSON type long to Date",
"code" : 16006,
"codeName" : "Location16006"
} : aggregate failed
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:370:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1319:5
@(shell):1:1

2019-05-20T18:11:10.818+0800 E QUERY [thread1] Error: command failed: {
"operationTime" : Timestamp(1558347069, 2),
"ok" : 0,
"errmsg" : "can't convert from BSON type long to Date",
"code" : 16006,
"codeName" : "Location16006"
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:16:14
assert.commandWorked@src/mongo/shell/assert.js:370:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1319:5
@(shell):1:1

报错了,问开发。开发又给了一个查询语句

1
2
3
4
5
6
7
8
db.antispam_image.aggregate([
{ $match: {"moduleId":5,createTs:{$gte:1554048000},createTs:{$lt:1556640000}}},
{ $group: {
_id :{ $dateToString: {format: "%Y-%m-%d", date:{"$add":[new Date(0),"$createTs"]}} },
count: { $sum: 1 }
}
}
]);

1
2
{ "_id" : "1970-01-19", "count" : 15131 }
{ "_id" : "1970-01-18", "count" : 180400 }

发现时间戳转化不对

于是查看官网资料,继续改写先把UNIX时间戳转化为日期。可是MongoDB又没有MySQL那种FROM_UNIXTIME()UNIX_TIMESTAMP()函数。只能自己造

通过将值乘以1000将createTs字段转换为毫秒时间戳

1
{ "$multiply": [1000, "$createTs"]}

$multiply 将数字相乘以返回产品。接受任意数量的参数表达式。

然后转换为日期

1
"$add": [ new Date(0), { "$multiply": [1000, "$createTs"]} ]

继续组装查询

1
{"$group": { "_id": { "year": { "$year": { "$add": [ new Date(0), { "$multiply": [1000, "$createTs"] } ]}}, "mmonth": { "$month": { "$add": [ new Date(0), { "$multiply": [1000, "$createTs"] } ]}}, "day": { "$dayOfMonth": { "$add": [ new Date(0), { "$multiply": [1000, "$createTs"] } ]}}}, "count" : { "$sum" : 1 }}}

$project管道中完成,方法是将毫秒时间添加到零毫秒Date(0)对象,然后从转换后的日期中提取$year,$month,$dayOfMonth个零件,可以在$group管道中使用这些零件对文档进行分组

完整的查询语句拼接出来

1
db.antispam_image.aggregate([{ $match: {"moduleId":5,createTs:{$gte:1554048000},createTs:{$lt:1556640000}}}, {"$group": { "_id": { "year": { "$year": { "$add": [ new Date(0), { "$multiply": [1000, "$createTs"] } ]}}, "mmonth": { "$month": { "$add": [ new Date(0), { "$multiply": [1000, "$createTs"] } ]}}, "day": { "$dayOfMonth": { "$add": [ new Date(0), { "$multiply": [1000, "$createTs"] } ] }}}, "count" : { "$sum" : 1 }}}]);

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 30 }, "count" : 624 }
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 28 }, "count" : 695 }
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 27 }, "count" : 683 }
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 26 }, "count" : 765 }
{ "_id" : { "year" : 2019, "mmonth" : 1, "day" : 12 }, "count" : 610 }
{ "_id" : { "year" : 2019, "mmonth" : 1, "day" : 4 }, "count" : 429 }
{ "_id" : { "year" : 2019, "mmonth" : 1, "day" : 3 }, "count" : 475 }
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 29 }, "count" : 732 }
{ "_id" : { "year" : 2018, "mmonth" : 12, "day" : 31 }, "count" : 592 }
{ "_id" : { "year" : 2018, "mmonth" : 12, "day" : 30 }, "count" : 542 }
{ "_id" : { "year" : 2019, "mmonth" : 3, "day" : 14 }, "count" : 1155 }
{ "_id" : { "year" : 2019, "mmonth" : 3, "day" : 13 }, "count" : 1169 }
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 20 }, "count" : 945 }
{ "_id" : { "year" : 2019, "mmonth" : 3, "day" : 15 }, "count" : 1062 }
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 24 }, "count" : 751 }
{ "_id" : { "year" : 2018, "mmonth" : 10, "day" : 15 }, "count" : 721 }
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 18 }, "count" : 895 }
{ "_id" : { "year" : 2018, "mmonth" : 10, "day" : 16 }, "count" : 713 }
{ "_id" : { "year" : 2019, "mmonth" : 4, "day" : 14 }, "count" : 1278 }
{ "_id" : { "year" : 2018, "mmonth" : 10, "day" : 17 }, "count" : 583 }
Type "it" for more
mgset-11469021:SECONDARY>

拿这样查询出来的数据问开发是否是这样、开发确认这样可以。需求解决。

开发写了一个查询

1
2
3
4
5
6
7
8
9
10
db.antispam_report.aggregate(
[
{
$project: {
createTs: 1,
date1Str: {$dateToString: {format: "%Y-%m-%d", date:{"$add":[new Date(0),{"$multiply":["$createTs",1000]},28800000]}}}
}
}
]
)

添加了时区时间28800000

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
{ "_id" : ObjectId("5bd41d96e870a1daab7a0d6d"), "createTs" : NumberLong(1540627862), "date1Str" : "2018-10-27" }
{ "_id" : ObjectId("5bd8fc96e870a1daab5e99c1"), "createTs" : NumberLong(1540947095), "date1Str" : "2018-10-31" }
{ "_id" : ObjectId("5bf9ed96e870a1daabe6b236"), "createTs" : NumberLong(1543105942), "date1Str" : "2018-11-25" }
{ "_id" : ObjectId("5c025140e870a1daaba6f00c"), "createTs" : NumberLong(1543655744), "date1Str" : "2018-12-01" }
{ "_id" : ObjectId("5bffc529e870a1daabadff49"), "createTs" : NumberLong(1543488809), "date1Str" : "2018-11-29" }
{ "_id" : ObjectId("5c0ce4b8e870a1daab28b208"), "createTs" : NumberLong(1544348856), "date1Str" : "2018-12-09" }
{ "_id" : ObjectId("5bf23dc4e870a1daab3b698c"), "createTs" : NumberLong(1542602181), "date1Str" : "2018-11-19" }
{ "_id" : ObjectId("5bf1fdc2e870a1daab7729b1"), "createTs" : NumberLong(1542585793), "date1Str" : "2018-11-19" }
{ "_id" : ObjectId("5bf9ed8be870a1daabe57c10"), "createTs" : NumberLong(1543105931), "date1Str" : "2018-11-25" }
{ "_id" : ObjectId("5c064d15e870a1daabcb43bd"), "createTs" : NumberLong(1543916821), "date1Str" : "2018-12-04" }
{ "_id" : ObjectId("5be44e5ce870a1daabb1b84f"), "createTs" : NumberLong(1541688924), "date1Str" : "2018-11-08" }
{ "_id" : ObjectId("5bffeafee870a1daab044619"), "createTs" : NumberLong(1543498494), "date1Str" : "2018-11-29" }
{ "_id" : ObjectId("5c076f01e870a1daabe32b96"), "createTs" : NumberLong(1543991041), "date1Str" : "2018-12-05" }
{ "_id" : ObjectId("5bdc4cc8e870a1daab855084"), "createTs" : NumberLong(1541164232), "date1Str" : "2018-11-02" }
{ "_id" : ObjectId("5bc343d8e870a1daab708be1"), "createTs" : NumberLong(1539523544), "date1Str" : "2018-10-14" }
{ "_id" : ObjectId("5bfcd64de870a1daab0d2617"), "createTs" : NumberLong(1543296589), "date1Str" : "2018-11-27" }
{ "_id" : ObjectId("5bc9adefe870a1daabf4ab80"), "createTs" : NumberLong(1539943919), "date1Str" : "2018-10-19" }
{ "_id" : ObjectId("5bbd8c31e870a1daab26cf3e"), "createTs" : NumberLong(1539148850), "date1Str" : "2018-10-10" }
{ "_id" : ObjectId("5beff335e870a1daab8be35a"), "createTs" : NumberLong(1542452021), "date1Str" : "2018-11-17" }
{ "_id" : ObjectId("5bc9ae33e870a1daabf4c03a"), "createTs" : NumberLong(1539943987), "date1Str" : "2018-10-19" }
Type "it" for more

结论

针对DBA这个岗位来说。大多数都是从事MongoDB 运维工作 😂

很少贴近开发需求,这次开发问了我这个问题。我当然无法立马给出答案。只能不断的查询。拼接,才能马马虎虎的满足了开发的需求

刚才问一个架构师,架构师说有一个更简单的方式:

  1. 查出总结出一天的数据,放到管道中临时保存起来
  2. 在用前一次查询的结束时间作为第二天的开始时间,在加上一天的时间(86400s)得出结尾时间。
  3. 查询完成在统一显示打印出来

这种方式就需要使用MongoDB forEach方式实现了。

开发又说不能按时间排序😱 , 妹的😂哪里来的这么多要求😱

参考

-------------本文结束感谢您的阅读-------------

本文标题:mongodb aggregate 基于UNIX时间戳的聚合

文章作者:Wang Jiemin

发布时间:2019年05月20日 - 18:05

最后更新:2019年05月21日 - 15:05

原始链接:https://jiemin.wang/2019/05/20/mongodb-aggregate/

许可协议: 署名-非商业性使用-禁止演绎 4.0 国际 转载请保留原文链接及作者。

0%