In this video, learn some best practices for optimal storage of text data.
- [Instructor] Storing text data during and after … the text mining process possesses unique challenges … because of its size and lack of structure. … So, what are the best practices to store text data? … Do not try to cram text data into an RDBMS. … Rather, use a suitable big data storage like HDFS, S3, … or Google Cloud Storage to store text data. … References to the storage can then be stored … in RDBMS records. … It is important to be able to query and filter data … in these object stores. … Create indexes on key data elements or words … either in a document database, like MongoDB, … or a search engine, like Elasticsearch. … Another option is to store processed data, … like tokens or TF-IDF arrays, for future consumption. … This reduces the need to process raw text again … while also saving on storage costs. … …
- Text mining today
- Reading text files using Python
- Cleansing text data
- Build n-grams databases for text predictions
- Preparing TF-IDF matrices for machine learning
- Scaling text processing for performance
Skill Level Intermediate
Processing Text with R Essential Trainingwith Kumaran Ponnambalam55m 57s Intermediate
1. Text Mining
2. Reading Text
3. Text Cleansing and Extraction
4. Advanced Text Processing
5. Best Practices
- Mark as unwatched
- Mark all as unwatched
Are you sure you want to mark all the videos in this course as unwatched?
This will not affect your course history, your reports, or your certificates of completion for this course.Cancel
Take notes with your new membership!
Type in the entry box, then click Enter to save your note.
1:30Press on any video thumbnail to jump immediately to the timecode shown.
Notes are saved with you account but can also be exported as plain text, MS Word, PDF, Google Doc, or Evernote.