Tuning SQL queries

SQL query tuning is a broad subject, but Cloudera recommends the following SQL best practices.

By: Manish Maheshwari, Data Architect and Data Scientist at Cloudera, Inc.

  • Hadoop and Impala are better suited to star schema data models than to third normal form (3NF) models. While Impala can work efficiently with 3NF models, the fewer joins and wider tables of a star schema typically result in faster query execution.
  • Use integer join keys rather than character or date join keys.
  • Always use columns of the VARCHAR data type rather than the CHAR data type.
  • Avoid using either implicit or explicit casts.
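
The recommendations above can be illustrated with a minimal sketch in Impala SQL. The table and column names (dim_customer, fact_sales, customer_key, and so on) are hypothetical and chosen only for illustration; the point is that the fact and dimension tables join on an integer key of the same type on both sides, so the join needs no cast.

```sql
-- Hypothetical star schema: one fact table and one dimension table.
CREATE TABLE dim_customer (
  customer_key  BIGINT,        -- integer surrogate key used for joins
  customer_name VARCHAR(100),  -- VARCHAR rather than CHAR
  region        VARCHAR(32)
) STORED AS PARQUET;

CREATE TABLE fact_sales (
  customer_key BIGINT,         -- same type as dim_customer.customer_key
  sale_year    INT,
  amount       DECIMAL(12,2)
) STORED AS PARQUET;

-- Both join columns are BIGINT, so no implicit or explicit cast is involved.
SELECT d.region,
       SUM(f.amount) AS total_amount
FROM fact_sales f
JOIN dim_customer d
  ON f.customer_key = d.customer_key
WHERE f.sale_year = 2023
GROUP BY d.region;

-- Pattern to avoid (assumes a hypothetical string key on one side):
--   ON CAST(f.customer_key AS STRING) = d.customer_id
```

If the two key columns had different types, a cast would be needed in the join condition. Aligning the column types when the tables are created avoids that.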