It looks like managed databases and tables have inconsistent group ownership: some are owned by the hadoop group and others by hdfs. We likely want the default group ownership to be analytics-privatedata-users.
Just chatted with @JAllemandou, here's what we think we should do.
Currently, /user/hive/warehouse has mode 1777 as recommended by Cloudera. This is not what we want. We want the default to be that new dirs and files are read+write by owners, read by analytics-privatedata-users, and not readable by others.
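To make the difference between the two modes concrete, here is a minimal sketch that decodes an octal mode into the familiar ls-style string. The function name `mode_to_string` is hypothetical and exists only for illustration; it touches nothing in HDFS.

```shell
# Decode an octal mode such as 1777 or 0750 into an ls-style permission string.
# mode_to_string is a hypothetical helper, not part of any Hadoop tooling.
mode_to_string() {
    bits=$(( 0$1 ))    # the leading 0 makes the shell read the value as octal
    out=""
    for s in 6 3 0; do # owner, group, other
        triad=$(( ($bits >> $s) & 7 ))
        r=-; w=-; x=-
        if [ $(( $triad & 4 )) -ne 0 ]; then r=r; fi
        if [ $(( $triad & 2 )) -ne 0 ]; then w=w; fi
        if [ $(( $triad & 1 )) -ne 0 ]; then x=x; fi
        out="$out$r$w$x"
    done
    # The sticky bit (01000) is shown in place of the last execute flag.
    if [ $(( $bits & 01000 )) -ne 0 ]; then
        case $out in
            *x) out="${out%x}t" ;;
            *)  out="${out%-}T" ;;
        esac
    fi
    printf '%s\n' "$out"
}

mode_to_string 1777   # rwxrwxrwt: everyone can read and write, sticky bit set
mode_to_string 0750   # rwxr-x---: owner full access, group read-only, others nothing
```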
To fix this, we will:
Set hive warehouse group ownership to analytics-privatedata-users, and chmod to 0750. This will cause new databases to be created with proper permissions and ownership.
sudo -u hdfs hdfs dfs -chgrp analytics-privatedata-users /user/hive/warehouse
sudo -u hdfs hdfs dfs -chmod 0750 /user/hive/warehouse
Set the same thing for all database dirs. This will cause newly created tables to be created with the proper permissions and ownership.
sudo -u hdfs hdfs dfs -chgrp analytics-privatedata-users /user/hive/warehouse/*
sudo -u hdfs hdfs dfs -chmod 0750 /user/hive/warehouse/*
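Since these changes affect shared directories, it may help to print the commands for review before running them. The following is only a sketch of a dry-run wrapper: the `run` and `fix_warehouse_perms` functions and the `DRY_RUN` variable are hypothetical names, not existing tooling. The glob is quoted so that the local shell passes it through unchanged and HDFS expands it.

```shell
# DRY_RUN defaults to 1, so nothing is changed unless it is explicitly set to 0.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

# Apply the two steps above: first the warehouse dir, then each database dir.
fix_warehouse_perms() {
    group=$1
    warehouse=$2
    run sudo -u hdfs hdfs dfs -chgrp "$group" "$warehouse"
    run sudo -u hdfs hdfs dfs -chmod 0750 "$warehouse"
    run sudo -u hdfs hdfs dfs -chgrp "$group" "$warehouse/*"
    run sudo -u hdfs hdfs dfs -chmod 0750 "$warehouse/*"
}

fix_warehouse_perms analytics-privatedata-users /user/hive/warehouse
```

Running it with `DRY_RUN=0` would execute the same four commands instead of printing them.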
We're not sure what we should do with existing tables and data. The right thing to do would be to set the same perms for those as well. However, there are a lot of user databases and tables, and we might break something for those users.
Let's discuss this as a team next week.
To do this for all files:
sudo -u hdfs hdfs dfs -chgrp -R analytics-privatedata-users /user/hive/warehouse/
sudo -u hdfs hdfs dfs -chmod -R g-w,o-rwxt /user/hive/warehouse/
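Note that the symbolic mode g-w,o-rwxt only removes bits, whereas chmod 0750 sets an absolute mode. A rough sketch of the effect on a few example starting modes, using a hypothetical `strip_mode` helper that does the bit arithmetic locally (the input modes are illustrative, not taken from the cluster):

```shell
# g-w clears 0020, o-rwx clears 0007, and t clears the sticky bit 01000.
# Combined mask: 01027. strip_mode is a hypothetical helper for illustration.
strip_mode() {
    printf '%04o\n' $(( 0$1 & ~01027 ))
}

strip_mode 1777   # 0750
strip_mode 0775   # 0750
strip_mode 0644   # 0640: files keep their lack of execute bits
strip_mode 0700   # 0700: owner-only dirs stay owner-only
```

So files that were never group-readable stay that way after the recursive chmod; only the group ownership changes for them.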
@Mayakp.wiki asked that her db and tables be owned by analytics-product. Done:
sudo -u hdfs hdfs dfs -chown -R analytics-product /user/hive/warehouse/mayakpwiki.db
Ah, I reverted the change above. Maya had meant that the tables in wmf_product should be owned by analytics-product:
sudo -u hdfs hdfs dfs -chown -R analytics-product /user/hive/warehouse/wmf_product.db