# Meet Variety, a Lightweight Schema Analyzer for MongoDB ###
This tool helps you get a sense of your application's schema, as well as any outliers to that schema. It's particularly useful when you inherit a codebase with a data dump and want to quickly learn how the data's structured. It's also useful for finding rare keys.
### An Easy Example ###
...
@@ -12,7 +12,7 @@ We'll make a collection:
So, let's see what we've got here:
$ mongo test --eval "var collection = 'users'" mongoDBSchemaAnalyzer.js
$ mongo test --eval "var collection = 'users'" variety.js
@@ -28,7 +28,7 @@ Interestingly, it looks like "pets" can be either an array or a string. Will thi
Seems like the first document created has a weird legacy key. Those damn fools who built the prototype didn't clean up after themselves. If there were a thousand such early documents, I might cross-reference the codebase to confirm they are no longer used, and then delete them all. That way they won't confuse any future developers.
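To make the idea of spotting rare keys and mixed types concrete, here is a minimal plain-JavaScript sketch that tallies top-level keys over an in-memory array. It is not how Variety itself is implemented. The sample documents, the `legacyKey` field, and the `typeOf` and `keyFrequencies` helpers are all hypothetical names made up for illustration.

```javascript
// Hypothetical sample documents; field names and values are illustrative only.
const docs = [
  { name: "Ann", bio: "Likes hiking.", pets: ["cat", "fish"], legacyKey: true },
  { name: "Bob", bio: "Plays chess.", pets: "dog" },
  { name: "Cy", pets: "parrot" },
];

// Name the type of a value, telling arrays and null apart from plain objects.
function typeOf(value) {
  if (Array.isArray(value)) return "array";
  if (value === null) return "null";
  return typeof value;
}

// For each top-level key, count the documents that contain it
// and collect every type that was seen for its value.
function keyFrequencies(documents) {
  const tally = new Map();
  for (const doc of documents) {
    for (const [key, value] of Object.entries(doc)) {
      if (!tally.has(key)) tally.set(key, { count: 0, types: new Set() });
      const entry = tally.get(key);
      entry.count += 1;
      entry.types.add(typeOf(value));
    }
  }
  // Most common keys first, so rare keys end up at the bottom.
  return [...tally.entries()]
    .map(([key, { count, types }]) => ({
      key,
      types: [...types].sort(),
      count,
      percent: (100 * count) / documents.length,
    }))
    .sort((a, b) => b.count - a.count);
}

console.table(keyFrequencies(docs));
```

In this sketch a key that appears in only one document lands in the last row, and a key whose `types` list has more than one entry (here `pets`) is a candidate for cleanup.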
Results are stored for future use in a schemaAnalyzerResults database.
Results are stored for future use in a varietyResults database.
### See Progress When Analysis Takes a Long Time ###
...
@@ -36,13 +36,13 @@ Tailing the log is great for this. Mongo provides a "percent complete" measureme
### Analyze Only Recent Documents ###
Perhaps you have a really large collection, and you can't wait a whole day for the Schema Analyzer's results.
Perhaps you have a really large collection, and you can't wait a whole day for Variety's results.
Perhaps you want to ignore a collection's oldest documents, and only see what the collection's documents' structures have been looking like, as of late.
One can apply a "limit" constraint, which analyzes only the newest documents in a collection, like so: