Instructions for producing the tables: a. Copy the Javascript version of results for each mongodb query listed below into a text editor. b. OPTIONAL: Then regex replace \/\*\s*\d+\s*\*\/ with a comma (','), remove the very first comma, and embed all the JS inside []. c. Paste that Javascript into https://json-csv.com/ to get the CSV tables Note that for step 6, there are 2 mongodb queries. The results of both have to be merged into a single csv file. ----------- 1. Table 1 db.Websites.aggregate([ { $unwind: "$geoLocationCountryCode" }, { $group: { _id: "$geoLocationCountryCode", count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); 1a. db.Websites.aggregate([ { $match: {urlContainsLangCodeInPath: true} }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: "$geoLocationCountryCode", count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossMatchingSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); 1b. db.Websites.aggregate([ {$match: {urlContainsLangCodeInPath: false} }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: "$geoLocationCountryCode", count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossMatchingSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); ----------- 2. Table 2 db.Websites.aggregate([ { $match: { numPagesInMRI: {$gt: 0} } }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: {$toLower: '$geoLocationCountryCode'}, count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossSitesWithPositiveMRICount: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); ----------- 3. Table 3 db.Websites.aggregate([ { $match: { numPagesContainingMRI: {$gt: 0} } }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: {$toLower: '$geoLocationCountryCode'}, count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossSitesWithPositiveContainsMRI: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); ----------- 4. Table 4 db.Websites.aggregate([ { $match: { $and: [ {numPagesContainingMRI: {$gt: 0}}, {$or: [{geoLocationCountryCode: /(NZ|AU)/}, {domain: /\.nz$/}, {urlContainsLangCodeInPath: false}]} ] } }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: {$toLower: '$geoLocationCountryCode'}, count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossMatchingSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); ----------- 5. Table 5 Outside of NZ: db.Websites.aggregate([ { $match: { $and: [ {geoLocationCountryCode: {$ne: "NZ"}}, {domain: {$not: /\.nz$/}}, {numPagesContainingMRI: {$gt: 0}}, {$or: [{geoLocationCountryCode: "AU"}, {urlContainsLangCodeInPath: false}]} ] } }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: {$toLower: '$geoLocationCountryCode'}, count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossMatchingSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); NZ: db.Websites.aggregate([ { $match: { $and: [ {numPagesContainingMRI: {$gt: 0}}, {$or: [{geoLocationCountryCode:"NZ"},{domain: /\.nz$/}]} ] } }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: "nz", count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossMatchingSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); To find NZ web pages IN MRI the following may be BETTER, as it looks for sites with positive numPagesINMRI rather than sites that only have positive containingMRI: db.Websites.aggregate([ { $match: { $and: [ {numPagesInMRI: {$gt: 0}}, {$or: [{geoLocationCountryCode:"NZ"},{domain: /\.nz$/}]} ] } }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: "nz", count: { $sum: 1 }, domain: { $addToSet: '$domain' }, numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossMatchingSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); 5b. Table 5b: Table of count of sites with numPagesContainingMRI > 0 Combine the following two: - OVERSEAS db.Websites.aggregate([ { $match: { $and: [ {geoLocationCountryCode: {$ne: "NZ"}}, {domain: {$not: /\.nz$/}}, {numPagesContainingMRI: {$gt: 0}} ] } }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: {$toLower: '$geoLocationCountryCode'}, count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossMatchingSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]); - NZ: db.Websites.aggregate([ { $match: { $and: [ {numPagesContainingMRI: {$gt: 0}}, {$or: [{geoLocationCountryCode:"NZ"},{domain: /\.nz$/}]} ] } }, { $unwind: "$geoLocationCountryCode" }, { $group: { _id: "nz", count: { $sum: 1 }, /*domain: { $addToSet: '$domain' },*/ numPagesInMRICount: { $sum: '$numPagesInMRI' }, numPagesContainingMRICount: { $sum: '$numPagesContainingMRI' }, totalPagesAcrossMatchingSites: { $sum: '$totalPages'} } }, { $sort : { count : -1} } ]);