Before coding on any platform, it helps to know the jargon used to describe things. In this video, you learn all the HBase terminology necessary to get started.
- [Instructor] Now let's take a look at how HBase organizes data in its data model. And this is key to understanding how to work with this data going forward. If we think of our employee table and want to build a model around that, first we have our table itself. Now the table is merely a collection of rows, and each row is identified by a row key. This is how HBase will identify that individual entity and provide random access for querying. Inside of that row, we have different column families.
First we have our column family for work, which contains attributes about our employee's job. Then we have the column family for demographic information. This column family has information such as their name and their age. Inside of the column families, we have the columns and values themselves. Now this is only a single row in a single table, but it scales for every single row of every employee. The idea with column families is that you group attributes together since you most likely will be returning them in the same query.
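To make the hierarchy concrete, here is a minimal sketch of the model just described, written as plain nested Python dictionaries. It is an illustration only and does not use the HBase client API; the row keys, column names, and values are hypothetical sample data chosen to match the employee example.

```python
# Illustrative in-memory model only; this is NOT the HBase client API.
# Layout: table -> row key -> column family -> column -> value
# All row keys and values below are hypothetical sample data.
employee_table = {
    "emp-001": {
        "work": {"department": "data", "salary": 85000},
        "demographic": {"name": "Ada", "age": 36},
    },
    "emp-002": {
        "work": {"department": "sales", "salary": 62000},
        "demographic": {"name": "Grace", "age": 41},
    },
}


def get_family(table, row_key, family):
    """Return every column in one column family of one row."""
    return table[row_key][family]


def get_value(table, row_key, family, column):
    """Return a single value addressed by row key, family, and column."""
    return table[row_key][family][column]


print(get_family(employee_table, "emp-001", "work"))
print(get_value(employee_table, "emp-002", "demographic", "name"))
```

The row key is the only entry point into a row, and asking for the "work" family returns the job attributes without touching the demographic ones.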
For example, we may want to pull in the number of employees in the data department and their average salary. We may not in this case care about their gender, age, or names, since we're only looking at salaries and which department they're in. We can also create new tables that store only these elements, which would be even faster and better for query performance but would add overhead to maintain. One thing we didn't touch on yet is the concept of the actual database itself. In HBase, we have something known as a namespace. Namespaces are logical groupings of tables, and they allow us to refer to tables in a similar fashion to how we would in a named relational database.
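The department query and the namespace idea can be sketched together. This is again a plain-Python illustration, not the HBase client API. The namespace "hr", the table "employee", and the sample values are assumptions made up for the example; the `namespace:table` naming mirrors how HBase writes a namespace-qualified table name.

```python
# Illustrative in-memory model only; this is NOT the HBase client API.
# The namespace "hr", the table "employee", and all values are hypothetical.
namespaces = {
    "hr": {
        "employee": {
            "emp-001": {
                "work": {"department": "data", "salary": 85000},
                "demographic": {"name": "Ada", "age": 36},
            },
            "emp-002": {
                "work": {"department": "data", "salary": 95000},
                "demographic": {"name": "Grace", "age": 41},
            },
            "emp-003": {
                "work": {"department": "sales", "salary": 62000},
                "demographic": {"name": "Alan", "age": 29},
            },
        }
    }
}


def resolve_table(qualified_name):
    """Look up a table from a 'namespace:table' style name."""
    namespace, _, table_name = qualified_name.partition(":")
    return namespaces[namespace][table_name]


def department_summary(table, department):
    """Return (headcount, average salary), reading only the 'work' family."""
    salaries = [
        row["work"]["salary"]
        for row in table.values()
        if row["work"]["department"] == department
    ]
    if not salaries:
        return 0, None
    return len(salaries), sum(salaries) / len(salaries)


count, average = department_summary(resolve_table("hr:employee"), "data")
print(count, average)
```

Notice that the summary never reads the demographic family, which is the point of grouping related attributes into the same column family.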
Another concept which is extremely useful here is versioning. Now versioning happens at the individual value level and we'll take a look at this a little bit closer coming up. I didn't mention it yet because we're going to really find some examples here and show how this works. The versioning has another benefit in that it gives us the ability to look back in time. Imagine if you wanted to know how many employees were active in a given department six months ago. Your HR database or wherever you put this information may not have tracked the timestamps of when people move in and out of the department or when they come and go from the company.
However, with HBase and its built-in versions, you can ask for the specific versions that were active during that time and find answers to these historical analytical questions.
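As a rough sketch of that look-back idea, the snippet below keeps several timestamped versions of one value (each employee's department) and answers "what was the value as of time T?". It is plain Python, not the HBase client API; the small integer timestamps and the sample history are hypothetical stand-ins for real timestamps and data.

```python
# Illustrative in-memory model only; this is NOT the HBase client API.
# Each cell keeps a list of (timestamp, value) versions.
# Timestamps are small hypothetical integers standing in for real ones.
department_history = {
    "emp-001": [(100, "sales"), (200, "data")],
    "emp-002": [(50, "data")],
    "emp-003": [(150, "data"), (300, "sales")],
}


def get_as_of(versions, timestamp):
    """Return the newest value written at or before the given timestamp."""
    candidates = [(ts, value) for ts, value in versions if ts <= timestamp]
    if not candidates:
        return None  # the cell had no value yet at that time
    return max(candidates, key=lambda pair: pair[0])[1]


def headcount_as_of(history, department, timestamp):
    """Count employees whose department matched at the given timestamp."""
    return sum(
        1
        for versions in history.values()
        if get_as_of(versions, timestamp) == department
    )


print(get_as_of(department_history["emp-001"], 150))
print(headcount_as_of(department_history, "data", 250))
```

This only works for history that is still stored. In HBase, how many versions are retained is a setting on the column family, so a look-back query can only see as far as that setting allows.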
This course can help professionals further their career in big data analytics using HBase and the Hadoop framework. Learn to describe HBase in the context of the NoSQL landscape, build simple architecture models, and explore basic HBase commands. Instructor Ben Sullins shows how all the concepts fit together, resulting in the kind of distributed big data storage you need for scalable, enterprise-level applications.
- What is HBase?
- Who uses HBase?
- Comparing HBase and an RDBMS
- How data is stored in HBase
- Data model operations
- HBase architecture
- Creating tables
- Querying data