Saturday, 15 August 2015

database design - MongoDB Schema Suggestion




I am trying to pick MongoDB as my preferred database. I need help with the design of the table.

App background - an analytics app where contacts push their own events and related custom data. A contact can have many events. Eg: the contact did this, did that, etc.

event_type, custom_data (json), epoch_time

Eg:

event 1: event_type: page_visited, custom_data: {url: pricing, referrer: google}, current_time

event 2: event_type: video_watched, custom_data: {url: video_link}, current_time

event 3: event_type: paid, custom_data: {plan: lite, price: 35}

These events are custom and defined by the user. Scalability is a concern.

These are the common use cases:

Give me a list of users who have come to the pricing page in the last 7 days.

Give me a list of users who watched the video and paid more than 50.

Give me a list of users who have visited pricing and watched the video but have not paid at least 20.
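To pin down what these three queries ask for, here is a minimal in-memory sketch in plain Python, with no database involved. The field name user_id, the helper names, and the reading of the third case as "no payment of 20 or more" are assumptions made for illustration only.

```python
SEVEN_DAYS = 7 * 24 * 60 * 60  # seconds, since events carry an epoch_time


def users_where(events, predicate):
    """Return the set of user ids with at least one event matching predicate."""
    return {e["user_id"] for e in events if predicate(e)}


def visited_pricing(e):
    return e["event_type"] == "page_visited" and e["custom_data"].get("url") == "pricing"


def watched_video(e):
    return e["event_type"] == "video_watched"


def paid_more_than(amount):
    return lambda e: e["event_type"] == "paid" and e["custom_data"].get("price", 0) > amount


def paid_at_least(amount):
    return lambda e: e["event_type"] == "paid" and e["custom_data"].get("price", 0) >= amount


def use_case_1(events, now):
    """Users who came to the pricing page in the last 7 days."""
    cutoff = now - SEVEN_DAYS
    return users_where(events, lambda e: visited_pricing(e) and e["epoch_time"] >= cutoff)


def use_case_2(events):
    """Users who watched the video and paid more than 50."""
    return users_where(events, watched_video) & users_where(events, paid_more_than(50))


def use_case_3(events):
    """Users who visited pricing and watched the video, with no payment of 20 or more."""
    engaged = users_where(events, visited_pricing) & users_where(events, watched_video)
    return engaged - users_where(events, paid_at_least(20))
```

Note that the second and third cases each combine several events belonging to the same user, so they cannot be answered by looking at one event in isolation.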

What's the best way to design the table? Is it a good idea to use embedded events in this case?

In Mongo they are called collections and not tables, since the data is not stored as rows/columns :)

(1) I'd create an events collection and a users collection.

(2) I'd use 1 document per event, with the userid in it.

(3) If you need realtime data, you will want an index on whatever you want to query (i.e. never do a scan over the whole collection).

(4) If there are things that are needed for reporting only, I'd recommend making a reporting node (i.e. a different Mongo instance) and using replication to copy the data to that instance. You can set additional indexes for reporting on that node. That way the additional indexes and expensive queries will not impact production performance.
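Points (2) and (3) could look roughly like the sketch below. It only builds plain Python dictionaries in MongoDB's query syntax and does not talk to a server. The field name user_id, the helper names, and the choice of indexed fields are assumptions for illustration, not something the question specifies.

```python
import time

SEVEN_DAYS = 7 * 24 * 60 * 60  # seconds


def make_event(user_id, event_type, custom_data, epoch_time=None):
    """Build one document per event, carrying the id of the user it belongs to."""
    return {
        "user_id": user_id,  # assumed field name linking the event to its user
        "event_type": event_type,
        "custom_data": custom_data,
        "epoch_time": int(time.time()) if epoch_time is None else epoch_time,
    }


# Compound index key spec as (field, direction) pairs: event type first, newest first.
EVENT_INDEX = [("event_type", 1), ("epoch_time", -1)]


def pricing_visitors_filter(now):
    """Query filter for events 'visited the pricing page in the last 7 days'."""
    return {
        "event_type": "page_visited",
        "custom_data.url": "pricing",
        "epoch_time": {"$gte": now - SEVEN_DAYS},
    }


def watched_and_paid_pipeline(min_price):
    """Aggregation pipeline for users who watched a video and paid more than min_price."""
    return [
        # Keep only the two kinds of events that matter for this question.
        {"$match": {"$or": [
            {"event_type": "video_watched"},
            {"event_type": "paid", "custom_data.price": {"$gt": min_price}},
        ]}},
        # Collect the distinct event types seen for each user.
        {"$group": {"_id": "$user_id", "types": {"$addToSet": "$event_type"}}},
        # Keep users that have both kinds.
        {"$match": {"types": {"$all": ["video_watched", "paid"]}}},
    ]
```

Assuming PyMongo as the driver, these values would be passed to calls such as db.events.create_index(EVENT_INDEX), db.events.distinct("user_id", pricing_visitors_filter(now)) and db.events.aggregate(watched_and_paid_pipeline(50)). Whether this particular index suits the workload depends on the queries actually run.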

Notes on sharding

If your events collection is going to become large, you may need to consider sharding, perhaps by user id. However, I'd recommend treating that as a longer-term solution and not diving into it until you need it.
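As a rough sketch of what sharding by user id involves, the helper below builds the two admin commands as plain dictionaries. The database name analytics, the collection name events, the field name user_id, and the use of a hashed shard key (rather than a ranged one) are all assumptions for illustration.

```python
def shard_events_commands(database="analytics", collection="events", field="user_id"):
    """Return the admin commands that enable sharding and shard the events collection.

    The command name must be the first key of each dictionary; Python dicts
    keep insertion order, so the order written here is preserved.
    """
    namespace = "{}.{}".format(database, collection)
    return [
        {"enableSharding": database},
        # A hashed key spreads users evenly across shards (an assumed choice).
        {"shardCollection": namespace, "key": {field: "hashed"}},
    ]
```

These would be run against the admin database of a sharded cluster through a mongos router. A hashed key distributes writes evenly, while a ranged key on the same field would keep each user's neighbours together, so which one fits depends on the queries.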

One thing to note is that Mongo currently (as of 2.6) uses database-level write locking. This means it can perform only 1 write at a time per database, although it allows many concurrent reads. So if you want a high-write system and have a lot of users, you will need to look at sharding at some point. However, in my experience so far, 1 primary node with a secondary (and a reporting node) is administratively easier to set up. We can currently handle around 10,000 operations per second with that setup.

However, we have had issues with spikes in users coming to the system. You'll want to make sure you have plenty of memory for your indexes, and SSDs are recommended too. A surge in users can result in cache misses (i.e. the index is not in memory), which cause reads from the hard disk.

One final note - there are a lot of NoSQL DBs and they all have their pros and cons. I found that high write, low read, and realtime analysis of lots of data is not really Mongo's strength, so it does depend on what you are doing. It sounds like you are still learning the fundamentals. It might be worth reading up on all the available types to pick the right tool for the right job.

mongodb database-design
