The Impala query engine can be confusing at times. This video deconstructs how Impala queries work so you can get your results faster.
- [Instructor] Now I want to figure out how to see what the Impala engine is doing when I write a query. I like to think of this as deconstructing these Impala queries. We'll start out with some basics, then we'll get into some more difficult types of queries, and I'll explain what's happening throughout and show you how you can inspect it, which is really helpful if you're running into any performance issues. First, let's just do a basic query from the table we built: SELECT * FROM customer_orders LIMIT 100.
You can see I have my results there. If I add EXPLAIN to the beginning of this, I'll get a query plan. If I scroll down, you can see how this works. If you haven't run any queries yet, you'll need to do COMPUTE STATS, so let's just do that first. I'll do this on a new line and put COMPUTE STATS on our table here, default.customer_orders. Run this, and hit play. That way it'll calculate everything that's going on with that table.
Then when we run our query again, we'll get the stats here. You read this plan from the bottom to the top.
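
For reference, the steps narrated above can be sketched as the following Impala statements. This is a minimal sketch, assuming the customer_orders table from the video lives in the default database. The SHOW TABLE STATS line is not part of the video; it is included only as an optional way to check that statistics were gathered.

```sql
-- The basic query from the video: return the first 100 rows.
SELECT * FROM default.customer_orders LIMIT 100;

-- Gather table and column statistics so the planner has them available.
COMPUTE STATS default.customer_orders;

-- Optional check (an addition, not shown in the video): list the gathered stats.
SHOW TABLE STATS default.customer_orders;

-- Prefix the query with EXPLAIN to print the plan instead of running it.
-- Read the resulting plan from the bottom to the top.
EXPLAIN SELECT * FROM default.customer_orders LIMIT 100;
```

EXPLAIN only prints the plan and does not execute the query, so it can be rerun freely while comparing the plan before and after COMPUTE STATS.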
- Explain which commands are used to make changes in HDFS.
- Identify the commands used to upload data from the command line to the HDFS.
- Recognize two operations the HDFS performs when a user moves files.
- Summarize how to remove files recursively in HDFS.
- Recall how to select and implement partitions.
- Explain how to flatten a Struct data type in HiveQL.