mongodb - Is there a way to query Mongo based on the length in bytes of a particular field?
Because of the hard limit of 1024 bytes on index keys in MongoDB 2.6.x, I've had to remove a useful compound index that included a text field; the field is quite long and contains high Unicode characters, so some entries exceeded the byte limit.

I've had to replace it with a hashed index on a single field, which forces MongoDB to open the BSON document to inspect the other fields that are no longer covered by the index.

I'd like to find and remove these long documents (so I can restore the original compound index), but I don't know how to query for documents where a field's data exceeds a given number of bytes. Does anyone know a way?
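One approach might be a $where query that computes the byte length on the fly (a rough sketch, assuming server-side JavaScript is enabled and that encodeURIComponent is available to $where; 800 is the same threshold I use below):

db.example.find({ $where: function () {
    // Same byte-counting trick as below: each %XX escape stands for one UTF-8 byte.
    return encodeURIComponent(this.text).replace(/%[A-F\d]{2}/g, 'U').length > 800;
} }).limit(20);

But $where evaluates JavaScript against every document, so it would only be practical as a one-off on a small collection.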
So far I've gone with this option...

I've created a new field in the data (which is unfortunate, since it requires significant I/O). A script goes through and sets the value for each document:
db.example.find({lb: {$exists: false}}).limit(200000).forEach(function (obj) {
    // UTF-8 byte length: URL-encode the string, then collapse each %XX escape
    // into a single placeholder character and take the length.
    var lengthBytes = encodeURIComponent(obj.text).replace(/%[A-F\d]{2}/g, 'U').length;
    // print("id=" + obj._id + "; lengthBytes=" + lengthBytes);
    db.example.update({ _id: obj._id }, { $set: { lb: NumberInt(lengthBytes) } });
});
I've done spot checks and the values match http://mothereff.in/byte-counter.
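For reference, the kind of spot check I mean in the shell (a sketch; é is 2 bytes and € is 3 bytes in UTF-8):

// Each %XX escape produced by encodeURIComponent represents one UTF-8 byte.
encodeURIComponent("é").replace(/%[A-F\d]{2}/g, 'U').length;  // 2
encodeURIComponent("€").replace(/%[A-F\d]{2}/g, 'U').length;  // 3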
Now I can query for long strings with:
db.example.find({lb: {$gt: 800}}).limit(20);
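If this query needs to run repeatedly, the helper field could also be indexed so the $gt filter doesn't scan the whole collection (a sketch, using the 2.6 ensureIndex helper):

// Plain ascending index on the byte-length field; cheap because lb is a small int.
db.example.ensureIndex({ lb: 1 });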
Note: NumberInt forces Mongo to store the length as a 32-bit int; otherwise it would be stored as a floating-point double.
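A quick way to verify that (a sketch): BSON type 1 is a double and type 16 is a 32-bit int, so the query below should return nothing if every lb was written with NumberInt:

// Finds any lb values that were accidentally stored as doubles (BSON type 1).
db.example.find({ lb: { $type: 1 } }).limit(5);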