From the course: Advanced NoSQL for Data Science
Prepare data with document databases - NoSQL Tutorial
Prepare data with document databases
- [Instructor] Now it's time to look at how we can prepare data for use with document databases. It's quite common to have to load data from comma-separated values files, or CSV files for short. Tab-separated values files, also known as TSV files, are also frequently used for data transfers. One way to work with these files is to use a scripting language like Python. Python's standard library has two useful modules for working with delimited text and JSON files: the csv and json modules. They provide functions for reading and writing these formats. csv.DictReader in the csv module is especially useful because it reads each row of a tabular file into a Python dictionary keyed by the column names. These dictionaries map easily to JSON. We can use the dump function in the json module to write a list of dictionaries to a JSON file, which can then be loaded into a document database. Now, we don't have to write custom scripts if we don't want to. Document databases like…
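The CSV-to-JSON steps described in the transcript can be sketched roughly as follows. This is a minimal illustration, not the instructor's exercise code: the helper names `rows_to_documents` and `write_documents` and the sample data are hypothetical, and in-memory streams stand in for real files so the example runs on its own.

```python
import csv
import io
import json


def rows_to_documents(text_stream, delimiter=","):
    """Read delimited text and return a list of dictionaries, one per row.

    Hypothetical helper for illustration. The header row supplies the keys.
    Pass delimiter="\t" for TSV input.
    """
    reader = csv.DictReader(text_stream, delimiter=delimiter)
    return [dict(row) for row in reader]


def write_documents(documents, json_stream):
    """Write the list of dictionaries to a stream as a JSON array."""
    json.dump(documents, json_stream, indent=2)


# Made-up sample data standing in for a real CSV file.
sample_csv = io.StringIO("name,city\nAda,London\nGrace,New York\n")
documents = rows_to_documents(sample_csv)

# Serialize to an in-memory buffer instead of a file on disk.
out = io.StringIO()
write_documents(documents, out)
json_text = out.getvalue()
```

With real files, the streams would typically come from `open("input.csv", newline="")` and `open("output.json", "w")`, where both file names are placeholders. Note that `csv.DictReader` returns every value as a string, so numeric columns may need converting before the documents are loaded into a database.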
Contents
- Document data models (1m 35s)
- JSON structures (1m 53s)
- Prepare data with document databases (3m 43s)
- Install Anaconda (1m 34s)
- Install MongoDB (2m 38s)
- Working with Jupyter (2m 43s)
- Explore data with document databases (5m 4s)
- Extract data with document databases (5m 50s)
- Perform quality checks (5m 43s)
- Index data with document databases (2m 20s)
- Data frames in MongoDB (4m 48s)
- Tips for using document databases for data science (2m 6s)