In this video, learn how to read files using a corpus reader.
- [Narrator] NLTK supports a special set of functions for reading a list of files into a corpus. They come as part of the NLTK corpus reader package. More details can be found on the NLTK website. Please install the NLTK package, if you have not done so before, using the pip install nltk command. We will also download the punkt package that we will use later in the examples. The plain text corpus reader (PlaintextCorpusReader) is used to read a list of text files under a directory. Each text file becomes a single file ID, but the contents of the files are combined into a single corpus. The data is then split into paragraphs, sentences, and tokens automatically while the corpus is read. In this example, we read the same "Spark-Course-Description.txt" file into the corpus. The raw contents of the file are then printed using the corpus.raw command. Let us run the code now and see the output.
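The code shown on screen is not part of this transcript, so the following is only a minimal sketch of the steps the narrator describes. It assumes NLTK is installed, and it writes a placeholder "Spark-Course-Description.txt" into a temporary directory because the real file contents are not given here. The directory, the sample text, and the variable names are all illustrative assumptions.

```python
import os
import tempfile

from nltk.corpus.reader.plaintext import PlaintextCorpusReader

# Stand-in for the course file: the real contents are not shown in the
# transcript, so this sample text is a placeholder.
corpus_dir = tempfile.mkdtemp()
sample_path = os.path.join(corpus_dir, "Spark-Course-Description.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("Placeholder course description.\n\nIt has two paragraphs.\n")

# Read every .txt file under the directory into one corpus.
corpus = PlaintextCorpusReader(corpus_dir, r".*\.txt")

file_ids = corpus.fileids()  # one file ID per text file
raw_text = corpus.raw()      # raw contents of all files, concatenated
words = corpus.words()       # tokens from the default word tokenizer

print(file_ids)
print(raw_text)
print(len(words))
```

Only `fileids()`, `raw()`, and `words()` are used here because they work without extra downloads. `corpus.paras()` and `corpus.sents()` rely on the Punkt sentence tokenizer mentioned in the narration, which has to be fetched first with `nltk.download`; the exact resource name depends on the NLTK version.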
- Text mining today
- Reading text files using Python
- Cleansing text data
- Building n-gram databases for text predictions
- Preparing TF-IDF matrices for machine learning
- Scaling text processing for performance
Skill Level: Intermediate
1. Text Mining
2. Reading Text
3. Text Cleansing and Extraction
4. Advanced Text Processing
5. Best Practices