Export Data from MongoDB

Are you looking for exporting data out of your MongoDB? In this article, we will look through various options to export data from MongoDB. Here we will look through a simple copy from the terminal, export the entire collection, specific fields, database queries and even data from mongo aggregate. Let's get started.

If you want to follow along, you can create an account with MongoDB. Once login creates an organization called MDBU, create a Project within MDBU called M001, then create a Free Tire Atlas cluster named Sandbox. These are the steps from the M001 course at MongoDB University. Which is free to learn as well. Once the cluster is set up, click on Load Sample Dataset, so load the DB with some sample data.

Simple Copy From Terminal

This step is pretty simple and obvious, which can be excluded, but for simple queries and a few no of data, it can be used. For this connect to your database using mongo shell

> mongo "mongodb+srv://cluster0.xxxxx.mongodb.net/db" --username user

> show dbs
> use sample_training
> show collections
> db.posts.find({})

This will return 20 posts, you can fill the query and get the exact data out of it. If the document has lots of fields, we can filter the fields as well.

> db.posts.find({author:"machine"}, {title:1, author:1})

this will return all (20) the posts by author "machine", it will return only the _id, title and author. We can filter fields either to include {title:1} or exclude the field {title:0}. We cannot mix include and exclude in same query. But for _id. db.posts.find({author:"machine"}, {title:1, author:1, _id:0}) is valid. _id is selected by default.

Printing mongo output to a file

If the results are more than 20, we need to type "it" to get the next 20 and so on. If there are no more but a few (few 20s) results, instead of copying the data manually, we can save the output to a file.

> mongo "mongodb+srv://cluster0.xxxxx.mongodb.net/db" --username user | tee mongo.txt
> mongo | tree forLocal.txt // <- for local mongo

this will save all contents into the file mongo.txt, now you can manipulate the file as required.

The above-mentioned two methods are not the best way to get the data, but if there are only a few data to copy, it will still be useable and fast (🙃 maybe). Now let's get into mongoexport.

mongoexport

mongoexport is a command-line tool that produces a JSON or CSV export of data stored in a MongoDB instance.

Error : error parsing command line options: error parsing uri: lookup cluster0.xxxxx.mongodb.net on 127.0.0.53:53: cannot unmarshal DNS messag
Resolve: update nameserver to 8.8.8.8 on /etc/reslove.conf

you may encounter some error as above. 
Let's export the posts collection from the sample_training collection.
mongoexport --uri mongodb+srv://<user>:<password>@cluster0.xxxxx.mongodb.net/sample_training --collection posts --type json --out posts.json
eg.
> mongoexport --uri mongodb+srv://sita:Ram%40123@cluster0.xxxxx.mongodb.net/sample_training --collection posts --type json --out posts.json

use your username and password, If your password has special characters, it should be URL encoded. In the above example, the password is Ram@123 and it is encoded as Ram%40123.



mongoexport with query

We can export the documents those satisfied the selected criteria. 

> mongoexport --uri mongodb+srv://sita:Ram%40123@cluster0.xxxxx.mongodb.net/sample_training --collection posts --query '{"author":"machine"}' --type json --out machinesPosts.json

With the query option we can pass the query for the document. query should be enclosed in single quotes and the query should be a valid JSON, key should be enclosed with quotes.

mongoexport with selected fields only

we can specify which fields select while exporting the data as well.

> mongoexport --collection posts --query --fields 'title,author' --type json --out postsTitles.json

For json type export, mongoexport includes the specified fields and the _id as well. If a field from subdocument in included in the fields, it will include all the subdocument, not just the field in the subdocument.

mongoexport to CSV

with the type option we can specify whether we want to download CSV or json file.

> mongoexport ..... --type csv --fields 'title,author' --out posts.csv

JSON export exports the entire collections, but CSV export need to specify the fields to export.

mongoexport to export aggregate pipeline

There is no direct way to run aggregate pipeline on the mongoexport tool, but we can play around to export the aggregate results.

while playing around the sample_training database and the post collections, I found that there are very few unique titles. So, let's group the data by title and count total no of posts on that title. we can achieve that with the following aggregate.

> db.posts.aggregate([{ $group: { _id: "$title", count: { $sum: 1 } } }])

$out

$out Takes the documents returned by the aggregation pipeline and writes them to a specified collection. The $out stage must be the last stage in the pipeline.

So, 
> db.posts.aggregate([{ $group: { _id: "$title", myCount: { $sum: 1 } } }, {$out: "postTitleCount"}])
> show collections

Now the postTitleCount collection is created with the results in it. we can export that collection.

> mongoexport --db sample_training --collection posts --type json --out posts.json

Export from Self Hosted Mongodb in Remote Server

You may have self-hosted MongoDB on some cloud service provider, app and API on the same machine, thus no need to expose the mongo to the outside world. How can we export data out of it? Let's talk about that.
All the commands we work in the above MongoDB Atlas, works with local machine as well.

first, connect to the remote server through SSH.

> mongo --username ram
> mongoexport --username ram --db pathsala --collection courses --type json --out courses.json

With the export, it will create the file in the current directory. Now we can pull the file to our local machine. 
> scp -i key.pem uname@serverIP:/home/uname/courses.json ./

now you got the mongoexport out of your server.

Conclusion

We can get data out of MongoDB in various ways. Which are simple just copying from the terminal or exporting the data using MongoDB database tool called mongoexport and pulling it back to local server through scp. Here we get to know the way to save the aggregate output to a collection and get the collection through the mongoexport. 

Posted by Sagar Devkota.

0 Comments