Commit 97a34e7f by Wes Freeman

Merge pull request #41 from todvora/master

tests refactoring (rewritten to java), tested all given examples and more
parents d26bfa08 7bd4f6a9
*~ *~
# intellij idea project files
\ No newline at end of file
language: node_js language: java
- oraclejdk8
services: mongodb services: mongodb
- "0.11"
install: install:
- npm install jshint -g - npm install jshint -g
- npm install jasmine-node -g
- npm install mongodb -g
script: script:
- jshint variety.js - jshint variety.js
- test/ - cd test && mvn test
\ No newline at end of file \ No newline at end of file
## Variety tests ## Variety tests
Tests are primarily configured for the [Travis-CI]( platform. See `.travis.yml` in the repository (`install` and `script` sections). Tests are primarily configured for the [Travis-CI]( platform. See `.travis.yml` in the repository (`script` section).
It's easy to run the tests locally, but you need to prepare your environment a little. This readme assumes a Linux machine; a Mac would probably work too.
On Windows you need [cygwin]( or some other way to run bash scripts.
## Dependencies ## Dependencies
[MongoDB]( installed, of course. Tests are written in [Jasmine](, Behavior-Driven JavaScript test framework. [MongoDB]( installed, of course. Tests are written in [JUnit](, using [Java 8](http:// [Maven 3]( is required.
In order to run Jasmine from command line, [Node.js]( is required. You should have Java 8 and Maven installed. Junit and other dependencies are then automatically handled by Maven (see `test/pom.xml`).
The [jasmine-node]( package provides the integration between Node.js and Jasmine.
Tests connect to MongoDB via the Node.js connector [node-mongodb-native](
## Run tests locally ## Run tests locally
Install node dependencies globally. This step is necessary just for the first run. After that, packages are installed and ready to use.
npm install jasmine-node -g
npm install mongodb -g
Run `test/` from repository base page. It will run whole lifecycle of tests. Assuming running MongoDB, go to directory `variety/test` (you should see `pom.xml` there) and run `mvn test`.
The main indicator of the test result is the [exit code]( of the script. The main indicator of the test result is the [exit code]( of the script.
If everything went well, the return code is `0`. If tests fail, the exit code is nonzero. Travis-CI monitors the exit code to report test success or failure. If everything went well, the return code is `0`. If tests fail, the exit code is nonzero. Travis-CI monitors the exit code to report test success or failure.
Tests produce verbose log messages for detecting problems and errors. Tests produce verbose log messages for detecting problems and errors.
## Tests lifecycle ## Tests lifecycle
- Initialization, prepare data, see `test/init.js` - Initialization, prepare data. Every test has method annotated with `@Before`.
- Variety analysis, run variety.js against prepared data - Variety analysis, run variety.js against prepared data and verify results. See ``, method `runAnalysis()` and methods annotated with `@Test`.
- Jasmine tests, see `test/variety_spec.js` - Resources cleanup, see method annotated with `@After`.
- Resources cleanup, see `test/cleanup.js`
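The lifecycle above maps onto JUnit's `@Before`, `@Test` and `@After` hooks. As a rough, framework-free sketch of the same ordering (the class and method names are hypothetical and not part of the repository), cleanup sits in `finally` so that it still runs when the analysis step fails:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the init -> analysis -> cleanup ordering.
// It does not touch MongoDB; each phase only records that it ran.
class LifecycleSketch {

    static List<String> run(final boolean analysisFails) {
        final List<String> log = new ArrayList<>();
        log.add("init");                  // corresponds to the @Before method
        try {
            log.add("analysis");          // corresponds to a @Test method
            if (analysisFails) {
                throw new IllegalStateException("simulated failure");
            }
            log.add("verified");
        } catch (final IllegalStateException e) {
            log.add("failed");
        } finally {
            log.add("cleanup");           // corresponds to the @After method
        }
        return log;
    }

    public static void main(final String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

In both runs the last recorded phase is `cleanup`, which is the property the `@After` method is meant to guarantee.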
## Used databases and collections ## Used databases and collections
Tests use two databases, `test` and `varietyResults`. In DB `test`, there will be created collection `users`. Tests use two databases, `test` and `varietyResults`. In DB `test`, there will be created collection `users`.
Collection is later analyzed by variety and results stored in DB `varietyResults`, collection `usersKeys`. Collection is later analyzed by variety and results stored in DB `varietyResults`, collection `usersKeys`.
Cleanup script removes `test.users` and `varietyResults.usersKeys` after tests run. It does not remove any database. Cleanup method should remove both test and analysis data.
## Contribute ## Contribute
You can extend `variety_spec.js` or create new JavaScript file with extension `_spec.js` (for example `max-depth_spec.js`). You can extend current test cases or create new JUnit test. All tests under `test/src/test/` are automatically included into run.
All `_spec.js` files are automatically included in tests by jasmine. \ No newline at end of file
\ No newline at end of file
// clean all resources created in init.js and during Variety execution and tests.
use test;
use varietyResults;
\ No newline at end of file
// This script contains initial data from Variety README. It will be called once before test run. All data created here
// should be removed in cleanup.js script. It is not necessary on Travis-CI, but very useful on local environments
use test;
db.users.insert({name: "Tom", bio: "A nice guy.", pets: ["monkey", "fish"], someWeirdLegacyKey: "I like Ike!"});
db.users.insert({name: "Dick", bio: "I swordfight.", birthday: new Date("1974/03/14")});
db.users.insert({name: "Harry", pets: "egret", birthday: new Date("1984/03/14")});
db.users.insert({name: "Geneviève", bio: "Ça va?"});
db.users.insert({name: "Jim", someBinData: new BinData(2,"1234")});
\ No newline at end of file
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns=""
\ No newline at end of file
package com.github.variety;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.StringJoiner;
* Variety wrapper, provides access to MongoDB database, collection and execution of variety analysis.
public class Variety {
* Hardcoded database name in variety.js for analysis results
public static final String VARIETY_RESULTS_DBNAME = "varietyResults";
public static final String PARAM_QUERY = "query";
public static final String PARAM_SORT = "sort";
public static final String PARAM_MAXDEPTH = "maxDepth";
public static final String PARAM_LIMIT = "limit";
private final String inputDatabase;
private final String inputCollection;
private final MongoClient mongoClient;
private Integer limit;
private Integer maxDepth;
private String query;
private String sort;
private boolean verbose = true;
* Create variety wrapper with defined connection to analysed database and collection
* @param database name of database, that will be analysed
* @param collection name of collection, that will be analysed
* @throws UnknownHostException Thrown when connection to default host and port of MongoDB fails
public Variety(final String database, final String collection) throws UnknownHostException {
this.inputDatabase = database;
this.inputCollection = collection;
this.mongoClient = new MongoClient();
* @return Access to MongoDB database, where variety stores computed results
public DB getVarietyResultsDatabase() {
return mongoClient.getDB(VARIETY_RESULTS_DBNAME);
* @return Access to collection with source data, that are provided for analysis
public DBCollection getSourceCollection() {
return mongoClient.getDB(inputDatabase).getCollection(inputCollection);
* Variety wrapper for {@code limit} option
public Variety withLimit(final Integer limit) {
this.limit = limit;
return this;
* Variety wrapper for {@code maxDepth} option
public Variety withMaxDepth(final Integer maxDepth) {
this.maxDepth = maxDepth;
return this;
* Variety wrapper for {@code query} option
public Variety withQuery(final String query) {
this.query = query;
return this;
* Variety wrapper for {@code sort} option
public Variety withSort(final String sort) {
this.sort = sort;
return this;
* Forward the stdout of the analysis script to the stdout of the Java process.
* Deprecated because it should only be used for debugging tests, not in the real tests themselves. If you
* need to read the stdout of variety, it can be accessed through {@link VarietyAnalysis#getStdOut()}
public Variety verbose() {
this.verbose = true;
return this;
* Executes mongo shell with configured variety options and variety.js script in path.
* @return Results of analysis including stdout of variety.js and verifier of collected keys
* @throws IOException
* @throws InterruptedException
public VarietyAnalysis runAnalysis() throws IOException, InterruptedException {
final String[] commands = new String[]{"mongo", this.inputDatabase, "--eval", buildParams(), getVarietyPath()};
final Process child = Runtime.getRuntime().exec(commands);
final int returnCode = child.waitFor();
final String stdOut = readStream(child.getInputStream());
if(returnCode != 0) {
throw new RuntimeException("Failed to execute variety.js with arguments: " + Arrays.toString(commands) + ".\n" + stdOut);
} else if(verbose) {
return new VarietyAnalysis(mongoClient, inputCollection, stdOut);
* @return Params passed to mongo client together with variety. Collection name is always present, other are optional
private String buildParams() {
final StringJoiner args = new StringJoiner(",");
args.add("var collection = '" + inputCollection + "'");
if(limit != null) {
args.add(PARAM_LIMIT + " = " + limit);
if(maxDepth != null) {
args.add(PARAM_MAXDEPTH + " = " + maxDepth);
if(query != null && !query.isEmpty()) {
args.add(PARAM_QUERY + " = " + query);
if(sort != null && !sort.isEmpty()) {
args.add(PARAM_SORT + " = " + sort);
return args.toString();
* @return absolute path to variety.js, stored in the same repository as these tests.
private String getVarietyPath() {
// TODO: is there any better way how to compute relative path to variety.js?
// relative path from maven compiled classes root to variety.js file.
return Paths.get(this.getClass().getResource("/").getFile()).getParent().getParent().getParent().resolve("variety.js").toString();
* Converts input stream to String containing lines separated by \n
private String readStream(final InputStream stream) {
final BufferedReader reader = new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8));
final StringJoiner builder = new StringJoiner("\n");
return builder.toString();
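The wrapper above does two things that are easy to check in isolation: it assembles the `--eval` string for the mongo shell, and it turns the shell's output stream into a single newline-separated string. The following is a minimal standalone sketch of both, not the committed code; the class name `MongoShellSketch` is hypothetical and only the `limit` and `maxDepth` options are shown. When a helper like this is wired into `runAnalysis()`, reading the stream before calling `waitFor()` is the safer order, since a child process can block once its output buffer is full.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.StringJoiner;
import java.util.stream.Collectors;

// Hypothetical helper, written only to illustrate the two steps.
class MongoShellSketch {

    // Builds the --eval argument; only the collection name is mandatory.
    static String buildEvalParams(final String collection, final Integer limit, final Integer maxDepth) {
        final StringJoiner args = new StringJoiner(",");
        args.add("var collection = '" + collection + "'");
        if (limit != null) {
            args.add("limit = " + limit);
        }
        if (maxDepth != null) {
            args.add("maxDepth = " + maxDepth);
        }
        return args.toString();
    }

    // Reads the whole stream as UTF-8 and joins its lines with '\n'.
    static String readStream(final InputStream stream) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(final String[] args) {
        System.out.println(buildEvalParams("users", 1, null));
        final InputStream in = new ByteArrayInputStream("a\nb\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readStream(in));
    }
}
```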
package com.github.variety;
import com.mongodb.*;
import org.junit.Assert;
import java.util.Arrays;
* Results of variety.js run in mongo shell. Contains stdout of shell and access to results collection. For convenience there
* is defined method verifyResult, that checks correct types and occurrences of desired key.
public class VarietyAnalysis {
private final MongoClient mongoClient;
private final String sourceCollectionName;
private final String stdOut;
* @param mongoClient connection to MongoDB
* @param sourceCollectionName name of original source collection. Used to access results in variety database
* @param stdOut output of analysis execution - output of variety.js script
public VarietyAnalysis(final MongoClient mongoClient, final String sourceCollectionName, final String stdOut) {
this.mongoClient = mongoClient;
this.sourceCollectionName = sourceCollectionName;
this.stdOut = stdOut;
* Verifier for collected results in variety analysis
* @param key Results should contain entry with this key
* @param totalOccurrences Results should contain entry with this total occurrences
* @param percentContaining Results should contain entry with this relative percentage
* @param types Expected data types of this entry (Based on MongoDB type names)
public void verifyResult(final String key, final double totalOccurrences, final double percentContaining, final String... types) {
final DBCursor cursor = getResultsCollection().find(new BasicDBObject("_id.key", key));
Assert.assertEquals("Entry with key '" + key + "' not found in variety results", 1, cursor.size());
final DBObject result =;
verifyKeyTypes(key, result, types);
Assert.assertEquals("Failed to verify total occurrences of key " + key, totalOccurrences, result.get("totalOccurrences"));
Assert.assertEquals("Failed to verify percents of key " + key, percentContaining, result.get("percentContaining"));
private void verifyKeyTypes(final String key, final DBObject result, final String[] expectedTypes) {
final BasicDBList types = (BasicDBList)((DBObject) result.get("value")).get("types");
"Incorrect count of expected(" + Arrays.toString(expectedTypes) + ") and real types(" + Arrays.toString(types.toArray())
+ ") of key: " + key, expectedTypes.length, types.size());
for (final String expected : expectedTypes) {
if (!types.contains(expected)) { Assert.fail("Type '" + expected + "' not found in real types(" + Arrays.toString(types.toArray()) + ") of key: " + key);
* @return Direct access to variety results collection of this analysis
public DBCollection getResultsCollection() {
return mongoClient.getDB(Variety.VARIETY_RESULTS_DBNAME).getCollection(getResultsCollectionName());
* @return Standard output of mongo client with variety.js analysis script executed.
public String getStdOut() {
return stdOut;
* @return name of the variety results collection. Format is {_original_name_}Keys. For collection cars it will be carsKeys.
private String getResultsCollectionName() {
return sourceCollectionName + "Keys";
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
* Tests basic collection structure provided in readme of variety
public class BasicAnalysisTest {
private Variety variety;
public void setUp() throws Exception {
this.variety = new Variety("test", "users");
public void tearDown() throws Exception {
public void verifyBasicResults() throws Exception {
final VarietyAnalysis analysis = variety.runAnalysis();
analysis.verifyResult("_id", 5, 100, "ObjectId");
analysis.verifyResult("name", 5, 100, "String");
analysis.verifyResult("bio", 3, 60, "String");
analysis.verifyResult("pets", 2, 40, "String", "Array");
analysis.verifyResult("someBinData", 1, 20, "BinData-old");
analysis.verifyResult("someWeirdLegacyKey", 1, 20, "String");
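The expected values in this test follow from the five sample documents: `percentContaining` appears to be the number of documents containing a key divided by the total number of documents, times 100 (for example `bio` occurs in 3 of 5 documents). A small sketch of that arithmetic, assuming this is how the percentage is derived; the helper name is hypothetical:

```java
// Hypothetical helper illustrating the arithmetic behind the expected values.
class PercentSketch {

    // occurrences: documents that contain the key; total: documents analysed
    static double percentContaining(final long occurrences, final long total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return occurrences * 100.0 / total;
    }

    public static void main(final String[] args) {
        System.out.println(percentContaining(3, 5)); // bio
        System.out.println(percentContaining(2, 5)); // pets
        System.out.println(percentContaining(1, 5)); // someBinData
    }
}
```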
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import com.mongodb.BasicDBObject;
import org.bson.types.Binary;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import java.util.ArrayList;
import java.util.Date;
* Verify, that variety can recognize all usual datatypes, including different bindata types.
* This test addresses issue
public class DatatypeRecognitionTest {
private Variety variety;
public void setUp() throws Exception {
this.variety = new Variety("test", "users");
variety.getSourceCollection().insert(new BasicDBObject()
.append("key_string", "Just plain String")
.append("key_boolean", true)
.append("key_number", 1)
.append("key_date", new Date())
.append("key_binData-generic", new Binary((byte)0x00, new byte[]{1,2,3,4}))
.append("key_binData-function", new Binary((byte) 0x01, new byte[]{1,2,3,4}))
.append("key_binData-old", new Binary((byte) 0x02, new byte[]{1,2,3,4}))
.append("key_binData-UUID", new Binary((byte) 0x03, new byte[]{1,2,3,4}))
.append("key_binData-MD5", new Binary((byte) 0x05, new byte[]{1,2,3,4}))
.append("key_binData-user", new Binary((byte) 0x80, new byte[]{1,2,3,4}))
.append("key_array", new ArrayList<>())
.append("key_object", new BasicDBObject())
.append("key_null", null)
public void tearDown() throws Exception {
public void testDatatypeRecognition() throws Exception {
final VarietyAnalysis analysis = variety.runAnalysis();
Assert.assertEquals(14, analysis.getResultsCollection().count());
analysis.verifyResult("_id", 1, 100, "ObjectId");
analysis.verifyResult("key_string", 1, 100, "String");
analysis.verifyResult("key_boolean", 1, 100, "Boolean");
analysis.verifyResult("key_number", 1, 100, "Number");
analysis.verifyResult("key_date", 1, 100, "Date");
analysis.verifyResult("key_binData-generic", 1, 100, "BinData-generic");
analysis.verifyResult("key_binData-function", 1, 100, "BinData-function");
analysis.verifyResult("key_binData-old", 1, 100, "BinData-old");
analysis.verifyResult("key_binData-UUID", 1, 100, "BinData-UUID");
analysis.verifyResult("key_binData-MD5", 1, 100, "BinData-MD5");
analysis.verifyResult("key_binData-user", 1, 100, "BinData-user");
analysis.verifyResult("key_array", 1, 100, "Array");
analysis.verifyResult("key_object", 1, 100, "Object");
analysis.verifyResult("key_null", 1, 100, "null"); // TODO: why has 'null' first letter lowercase, unlike all other types?
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import com.mongodb.DBObject;
import com.mongodb.util.JSON;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
* Variety outputs json-like results to stdout. Lets verify that.
public class JsonOutputTest {
private Variety variety;
public void setUp() throws Exception {
this.variety = new Variety("test", "users");
public void tearDown() throws Exception {
public void verifyJsonEntries() throws Exception {
final VarietyAnalysis analysis = variety.runAnalysis();
// TODO: output itself is not valid JSON. It contains mongo shell output (can be removed with --quiet) and variety execution info.
// At the end of output, there are printed records from result collection, every record on new line.
// Valid json output is requested in issue
// Verify, that every object is parse-able json by transforming strings to json stream
// Results are detected by line starting with character '{'.
final Stream<DBObject> objects = Stream.of(analysis.getStdOut().split("\n"))
.filter(line -> line.startsWith("{"))
.map(str -> (DBObject)JSON.parse(str));
// there should be seven different json results in the stdout
Assert.assertEquals(7, objects.count());
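The filtering step of this test can be tried without MongoDB. The sketch below keeps only the lines that start with `{` and counts them; the sample output in `main` is invented for illustration and is not real variety output, and the class name is hypothetical.

```java
import java.util.stream.Stream;

// Hypothetical helper: counts stdout lines that look like JSON records.
class JsonLineFilterSketch {

    static long countJsonLines(final String stdout) {
        return Stream.of(stdout.split("\n"))
                .filter(line -> line.startsWith("{"))
                .count();
    }

    public static void main(final String[] args) {
        // Invented sample: one banner line, one config line, two records.
        final String sample = "shell banner\nUsing limit of 5\n{ \"a\" : 1 }\n{ \"b\" : 2 }";
        System.out.println(countJsonLines(sample));
    }
}
```

Note that `count()` is a terminal operation, so a stream built this way can be consumed only once; a test that needs both the count and the parsed objects would have to collect them into a list first.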
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
* Tests limit functionality of variety. It should analyse only first _n_ objects and then compute occurrences from
* all objects in collection.
public class LimitResultsAnalysisTest {
private Variety variety;
public void setUp() throws Exception {
this.variety = new Variety("test", "users");
public void tearDown() throws Exception {
public void verifyLimitedResults() throws Exception {
final VarietyAnalysis analysis = variety.withLimit(1).runAnalysis();
analysis.verifyResult("_id", 5, 100, "ObjectId");
analysis.verifyResult("name", 5, 100, "String");
// TODO: there is only one document with 'someBinData'. Why variety returns 5/100% instead of 1/20% ?
// FIXME: analysis.verifyResult("someBinData", 1, 20, "BinData-old");
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import com.mongodb.DBObject;
import com.mongodb.util.JSON;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
public class MaxDepthAnalysisTest {
private static final double EXPECTED_PERCENTS = 100; // TODO: why does the documentation mention 16.66666666666666, when there is only one document at all?
private Variety variety;
public void setUp() throws Exception {
variety = new Variety("test", "users");
variety.getSourceCollection().insert((DBObject) JSON.parse("{name:'Walter', someNestedObject:{a:{b:{c:{d:{e:1}}}}}}"));
public void tearDown() throws Exception {
public void testUnlimitedAnalysis() throws Exception {
final VarietyAnalysis analysis = variety.runAnalysis();
Assert.assertEquals("Variety results have not correct count of entries", 8, analysis.getResultsCollection().count()); // 8 results, including '_id' and 'name'
analysis.verifyResult("_id", 1, EXPECTED_PERCENTS, "ObjectId");
analysis.verifyResult("name", 1, EXPECTED_PERCENTS, "String");
analysis.verifyResult("someNestedObject", 1, EXPECTED_PERCENTS, "Object");
analysis.verifyResult("someNestedObject.a", 1, EXPECTED_PERCENTS, "Object");
analysis.verifyResult("someNestedObject.a.b", 1, EXPECTED_PERCENTS, "Object");
analysis.verifyResult("someNestedObject.a.b.c", 1, EXPECTED_PERCENTS, "Object");
analysis.verifyResult("someNestedObject.a.b.c.d", 1, EXPECTED_PERCENTS, "Object");
analysis.verifyResult("someNestedObject.a.b.c.d.e", 1, EXPECTED_PERCENTS, "Number");
public void testLimitedDepthAnalysis() throws Exception {
final VarietyAnalysis analysis = variety.withMaxDepth(3).runAnalysis();
// TODO: depth 3 means 'someNestedObject.a.b' or 'someNestedObject.a.b.c'? Documentation describes the first variant, variety counts also second.
// FIXME: Assert.assertEquals("Variety results have not correct count of entries", 5, analysis.getResultsCollection().count()); // 5 results, including '_id' and 'name'
analysis.verifyResult("_id", 1, EXPECTED_PERCENTS, "ObjectId");
analysis.verifyResult("name", 1, EXPECTED_PERCENTS, "String");
analysis.verifyResult("someNestedObject", 1, EXPECTED_PERCENTS, "Object");
analysis.verifyResult("someNestedObject.a", 1, EXPECTED_PERCENTS, "Object");
analysis.verifyResult("someNestedObject.a.b", 1, EXPECTED_PERCENTS, "Object");
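The TODO above asks what "depth 3" should mean. One reading, matching the variant the documentation describes, is that depth counts the segments of the dotted key path, so `someNestedObject.a.b` is the deepest key reported. The sketch below implements only that interpretation on plain Java maps; it is an assumption for illustration, not how variety.js is written, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: collect dotted key paths up to a maximum number of segments.
class KeyPathSketch {

    static List<String> keyPaths(final Map<String, Object> doc, final int maxDepth) {
        final List<String> out = new ArrayList<>();
        collect(doc, "", 1, maxDepth, out);
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void collect(final Map<String, Object> node, final String prefix,
                                final int depth, final int maxDepth, final List<String> out) {
        if (depth > maxDepth) {
            return; // keys below this level are not reported
        }
        for (final Map.Entry<String, Object> e : node.entrySet()) {
            final String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            out.add(path);
            if (e.getValue() instanceof Map) {
                collect((Map<String, Object>) e.getValue(), path, depth + 1, maxDepth, out);
            }
        }
    }

    // Builds {name:'Walter', someNestedObject:{a:{b:{c:{d:{e:1}}}}}} without the _id key.
    static Map<String, Object> walter() {
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("e", 1);
        for (final String key : new String[]{"d", "c", "b", "a"}) {
            final Map<String, Object> wrapper = new LinkedHashMap<>();
            wrapper.put(key, inner);
            inner = wrapper;
        }
        final Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "Walter");
        doc.put("someNestedObject", inner);
        return doc;
    }

    public static void main(final String[] args) {
        System.out.println(keyPaths(walter(), 99));
        System.out.println(keyPaths(walter(), 3));
    }
}
```

Under this reading the limited run yields four keys, or five once `_id` is added, which is the count the commented-out assertion expects.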
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import java.util.Map;
* Verify, that variety can read and use(re-print to stdout) passed parameters like limit, sort, query and maxDepth.
public class ParametersParsingTest {
private Variety variety;
public void setUp() throws Exception {
this.variety = new Variety("test", "users");
public void tearDown() throws Exception {
* Verify default parameters of variety.
public void verifyDefaultResultsStdout() throws Exception {
final VarietyAnalysis analysis = variety.runAnalysis();
final Map<String, String> params = getParamsMap(analysis.getStdOut());
Assert.assertEquals("99", params.get(Variety.PARAM_MAXDEPTH));
Assert.assertEquals("{ }", params.get(Variety.PARAM_QUERY));
Assert.assertEquals("{ \"_id\" : -1 }", params.get(Variety.PARAM_SORT));
Assert.assertEquals("5", params.get(Variety.PARAM_LIMIT)); // TODO: why is limit configured to current count, not set as 'unlimited'? It could save one count query
* Verify, that all passed parameters are correctly recognized and printed out in stdout of variety.
public void verifyRestrictedResultsStdout() throws Exception {
final VarietyAnalysis analysis = variety
final Map<String, String> params = getParamsMap(analysis.getStdOut());
Assert.assertEquals("5", params.get(Variety.PARAM_MAXDEPTH));
Assert.assertEquals("{ \"name\" : \"Harry\" }", params.get(Variety.PARAM_QUERY));
Assert.assertEquals("{ \"name\" : 1 }", params.get(Variety.PARAM_SORT));
Assert.assertEquals("2", params.get(Variety.PARAM_LIMIT));
* Verify, that variety recognizes an unknown or empty collection and exits. The reason should be recorded in stdout.
public void testUnknownCollectionResponse() throws Exception {
this.variety = new Variety("test", "--unknown--");
try {
variety.runAnalysis();
Assert.fail("It should throw exception");
} catch (final RuntimeException e) {
Assert.assertTrue(e.getMessage().contains("does not exist or is empty"));
* @param stdout Text from mongo shell, containing variety config output + json results
* @return Map of config values
private Map<String, String> getParamsMap(final String stdout) {
return Stream.of(stdout.split("\n"))
.filter(line -> line.startsWith("Using "))
.map(v -> v.replace("Using ", ""))
.collect(Collectors.toMap(k -> k.split(" of ")[0], v -> v.split(" of ")[1]));
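The parsing in `getParamsMap` relies on config lines of the form `Using <name> of <value>`, as implied by the assertions above. A standalone sketch of the same idea follows; the sample text in `main` is assumed from those assertions rather than copied from real output, and the class name is hypothetical. It splits with a limit of 2, so a value that itself contains " of " would stay intact.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: extracts "Using <name> of <value>" pairs from shell output.
class ParamsMapSketch {

    static Map<String, String> parse(final String stdout) {
        final Map<String, String> params = new LinkedHashMap<>();
        for (final String line : stdout.split("\n")) {
            if (!line.startsWith("Using ")) {
                continue; // banner lines and result records are ignored
            }
            // Limit 2 keeps any later " of " inside the value.
            final String[] parts = line.substring("Using ".length()).split(" of ", 2);
            if (parts.length == 2) {
                params.put(parts[0], parts[1]);
            }
        }
        return params;
    }

    public static void main(final String[] args) {
        // Assumed sample, shaped after the assertions in the test above.
        final String sample = "shell banner\nUsing query of { }\nUsing limit of 5\nUsing maxDepth of 99";
        System.out.println(parse(sample));
    }
}
```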
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
public class QueryLimitedAnalysisTest {
private Variety variety;
public void setUp() throws Exception {
variety = new Variety("test", "users");
public void tearDown() throws Exception {
public void testQueryLimitedAnalysis() throws Exception {
final VarietyAnalysis analysis = variety.withQuery("{someBinData:{$exists: true}}").runAnalysis();
Assert.assertEquals(3, analysis.getResultsCollection().count());
// TODO: are those percentContaining numbers correct? Should percents be limited to all data or query data?
analysis.verifyResult("_id", 1, 20, "ObjectId");
analysis.verifyResult("name", 1, 20, "String");
analysis.verifyResult("someBinData", 1, 20, "BinData-old");
package com.github.variety.test;
import com.mongodb.BasicDBObjectBuilder;
import com.mongodb.DBObject;
import org.bson.types.Binary;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
class SampleData {
* Java representation of sample collection provided in variety README:<p>
* {name: "Tom", bio: "A nice guy.", pets: ["monkey", "fish"], someWeirdLegacyKey: "I like Ike!"}<p>
* {name: "Dick", bio: "I swordfight.", birthday: new Date("1974/03/14")}<p>
* {name: "Harry", pets: "egret", birthday: new Date("1984/03/14")}<p>
* {name: "Geneviève", bio: "Ça va?"}<p>
* {name: "Jim", someBinData: new BinData(2,"1234")}<p>
public static List<DBObject> getDocuments() {
final List<DBObject> examples = new ArrayList<>();
new BasicDBObjectBuilder()
.add("name", "Tom")
.add("bio", "A nice guy.")
.add("pets", Arrays.asList("monkey", "fish"))
.add("someWeirdLegacyKey", "I like Ike!")
new BasicDBObjectBuilder()
.add("name", "Dick")
.add("bio", "I swordfight.")
.add("birthday", LocalDate.of(1974, 3, 14).toString())
new BasicDBObjectBuilder()
.add("name", "Harry")
.add("pets", "egret")
.add("birthday", LocalDate.of(1984, 3, 14).toString())
new BasicDBObjectBuilder()
.add("name", "Geneviève")
.add("bio", "Ça va?")
new BasicDBObjectBuilder()
.add("name", "Jim")
.add("someBinData", new Binary((byte) 0x02, new byte[]{1,2,3,4}))
return examples;
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
* Verify, that variety can handle sort parameter and analyse collection in given order. It is useful only when
* used together with limit.
public class SortedAnalysisTest {
private Variety variety;
public void setUp() throws Exception {
variety = new Variety("test", "users");
public void tearDown() throws Exception {
public void testSortedAnalysis() throws Exception {
// Sort without limit or other query should not modify results itself. Analysis is done on the same data, only in another order.
final VarietyAnalysis analysis = variety.withSort("{name:-1}").runAnalysis();
analysis.verifyResult("_id", 5, 100, "ObjectId");
analysis.verifyResult("name", 5, 100, "String");
analysis.verifyResult("bio", 3, 60, "String");
analysis.verifyResult("pets", 2, 40, "String", "Array");
analysis.verifyResult("someBinData", 1, 20, "BinData-old");
analysis.verifyResult("someWeirdLegacyKey", 1, 20, "String");
public void testSortedAnalysisWithLimit() throws Exception {
// When sorting the default SampleData by name descending, the first entry becomes Tom. He is the only document with key 'someWeirdLegacyKey'.
// With limit 1 applied as well, Tom is the only document analysed. That lets us predict the keys and verify
// that the ordering is correct.
final VarietyAnalysis analysis = variety.withSort("{name:-1}").withLimit(1).runAnalysis();
Assert.assertEquals(5, analysis.getResultsCollection().count());
// TODO: are those percentContaining numbers correct? Should percents be limited to all data or query data?
// Why total counts are always 5, when 'someWeirdLegacyKey' has only one object?
// Keys and types are correct, total count and percents seems not right.
analysis.verifyResult("_id", 5, 100, "ObjectId");
analysis.verifyResult("name", 5, 100, "String");
analysis.verifyResult("bio", 5, 100, "String");
analysis.verifyResult("pets", 5, 100, "Array");
analysis.verifyResult("someWeirdLegacyKey", 5, 100, "String");
package com.github.variety.test;
import com.github.variety.Variety;
import com.github.variety.VarietyAnalysis;
import com.mongodb.DBObject;
import com.mongodb.util.JSON;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
* Test, how variety handles objects, that are not named (for example objects inside array).
* It addresses behavior described in issue
public class UnnamedObjectsAnalysisTest {
private Variety variety;
public void setUp() throws Exception {
this.variety = new Variety("test", "users");
variety.getSourceCollection().insert((DBObject) JSON.parse("{title:'Article 1', comments:[{author:'John', body:'it works', visible:true }]}"));
variety.getSourceCollection().insert((DBObject) JSON.parse("{title:'Article 2', comments:[{author:'Tom', body:'thanks'}]}"));
public void tearDown() throws Exception {
public void testUnnamedObjects() throws Exception {
final VarietyAnalysis analysis = variety.runAnalysis();
analysis.verifyResult("_id", 2, 100, "ObjectId");
analysis.verifyResult("title", 2, 100, "String");
analysis.verifyResult("comments", 2, 100, "Array");
// TODO: current version of variety is not able to handle unnamed inside objects. Earlier they were marked with XX. key prefix.
// Now the unnamed object are skipped and not analysed at all. Example of earlier version results can be seen
// in issue
// There should be 6 different keys: _id, title, comments and three from anonymous objects:, comments.XX.body, comments.XX.visible
// FIXME: Assert.assertEquals(6, analysis.getResultsCollection().count());
// FIXME: analysis.verifyResult("", 2, 100, "String");
// FIXME: analysis.verifyResult("comments.XX.body", 2, 100, "String");
// FIXME: analysis.verifyResult("comments.XX.visible", 1, 50, "Boolean");
# Need to link globally installed libraries to local directory. Without this, it is not possible to call 'require(lib)'
# in source codes (jasmine tests). This could be replaced by configuring package.json (define dependencies, test script)
npm link mongodb
# current script directory, used as relative path to test resources
DIRNAME=`dirname $0`
# cumulative return code
# init scripts, create test data used by variety
mongo test < $DIRNAME/init.js
# check, that import finished correctly, otherwise exit (TODO: should we try to cleanup resources?)
if [ $? -ne 0 ]; then
echo "Failed to initialize tests from file $DIRNAME/init.js"
exit 1
# run variety itself. Analyze collection users in database test
mongo test --eval "var collection = 'users'" $DIRNAME/../variety.js
# in case of fail do not exit, just log, set return code and continue to cleanup
if [ $? -ne 0 ]; then
echo "Failed to execute variety"
if [ $RETURNCODE -eq 0 ]; then
echo "Running jasmine tests for variety"
jasmine-node $DIRNAME/../test --verbose --captureExceptions
if [ $? -ne 0 ]; then
echo "There were test errors, see log above!"
echo "Tests finished, no problem detected"
echo "tests skipped because of fail when run variety analyzer"
#cleanup resources
mongo test < $DIRNAME/cleanup.js
if [ $? -ne 0 ]; then
echo "Failed to cleanup test resources"
\ No newline at end of file
var mongo = require('mongodb');
var CONNECTION_STRING = 'mongodb://';
var mongoClient = mongo.MongoClient;
var withVarietyDb = function (callback) {
mongoClient.connect(CONNECTION_STRING, function (err, db) {
if (err) throw err;
var verifyVarietyResultEntry = function (keyName, expectedType, occurrencesCount, percentContaining, doneCallback) {
withVarietyDb(function (db) {
var collection = db.collection(TESTED_COLLECTION_NAME);
collection.findOne({'_id.key': keyName}, function (err, result) {
if (err) throw err;
if (result != null) {
describe("Variety results", function () {
it("should verify correct count of results", function (done) {
withVarietyDb(function (db) {
db.collection(TESTED_COLLECTION_NAME).count(function (err, count) {
if (err) throw err;
it("should verify correct '_id' result", function (done) {
verifyVarietyResultEntry("_id", ["ObjectId"], 5, 100, done);
it("should verify correct 'name' result", function (done) {
verifyVarietyResultEntry("name", ["String"], 5, 100, done);
it("should verify correct 'bio' result", function (done) {
verifyVarietyResultEntry("bio", ["String"], 3, 60, done);
it("should verify correct 'pets' result", function (done) {
verifyVarietyResultEntry("pets", [ "String", "Array" ], 2, 40, done);
it("should verify correct 'birthday' result", function (done) {
verifyVarietyResultEntry("birthday", ["Date"], 2, 40, done);
it("should verify correct 'someBinData' result", function (done) {
verifyVarietyResultEntry("someBinData", ["BinData-old"], 1, 20, done);
it("should verify correct 'someWeirdLegacyKey' result", function (done) {
verifyVarietyResultEntry("someWeirdLegacyKey", ["String"], 1, 20, done);