We know that HDFS supports Kerberos authentication, but how does HDFS map Kerberos principals to local Unix/Linux usernames?
Before we begin, let's first understand what a Kerberos principal consists of.
For example, the principal hdfs@DOMAIN.COM has two components:
a realm component, DOMAIN.COM, and a user component, hdfs. Sometimes the user component consists of two sub-components separated by /, as in hdfs/admin@DOMAIN.COM.
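To make the structure concrete, here is a minimal Python sketch that splits a principal string into its parts. The names Principal and parse_principal are made up for this illustration and are not part of Hadoop or Kerberos.

```python
from typing import List, NamedTuple


class Principal(NamedTuple):
    components: List[str]  # user part split on "/", e.g. ["hdfs", "admin"]
    realm: str             # e.g. "DOMAIN.COM"; empty if no "@" was present


def parse_principal(principal: str) -> Principal:
    """Split 'user[/instance]@REALM' into its components and realm."""
    name, sep, realm = principal.rpartition("@")
    if not sep:
        # No "@" in the string: treat everything as the user part.
        name, realm = principal, ""
    return Principal(name.split("/"), realm)
```

Calling parse_principal("hdfs/admin@DOMAIN.COM") gives the components hdfs and admin and the realm DOMAIN.COM.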
Why convert a Kerberos principal to a local user?
HDFS uses ShellBasedUnixGroupsMapping by default, which means it uses Linux/Unix commands (for example, id username) to fetch the group details of a particular user. The group details are then used for access control checks on HDFS files and folders.
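The idea of a shell-based group lookup can be sketched in a few lines of Python. This is only an illustration of the approach, not Hadoop's code: the helper names are hypothetical, and the choice of id -Gn (which prints group names on one line) is an assumption made here because its output is easy to parse.

```python
import subprocess
from typing import List


def parse_groups(id_output: str) -> List[str]:
    """Turn the whitespace-separated output of `id -Gn` into a list.

    Assumes group names contain no spaces.
    """
    return id_output.split()


def groups_for(username: str) -> List[str]:
    """Ask the operating system for the groups of a local user."""
    result = subprocess.run(
        ["id", "-Gn", username],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the user does not exist
    )
    return parse_groups(result.stdout)
```

This also shows why the short name matters: the lookup only works if the name passed to it exists as a local user.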
How does HDFS convert a principal to a local user?
HDFS translates Kerberos principals using a set of regex rules defined in core-site.xml. The hadoop.security.auth_to_local property contains these rules.
The default rule strips the realm name from principals that belong to the default realm,
i.e. hdfs@DOMAIN.COM is converted to hdfs.
The regex syntax is similar to that of Perl.
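The behaviour of the default rule can be approximated with the small Python sketch below. It assumes that the rule only applies when the principal's realm equals the cluster's default realm and that it keeps the first component of the name; default_rule is a hypothetical helper, and principals without a realm are not handled.

```python
from typing import Optional


def default_rule(principal: str, default_realm: str) -> Optional[str]:
    """Return the short name, or None if the default rule does not apply."""
    name, _, realm = principal.partition("@")
    if realm != default_realm:
        # A principal from another realm needs an explicit rule.
        return None
    # Keep only the first component: "hdfs/admin" becomes "hdfs".
    return name.split("/")[0]
```

With DOMAIN.COM as the default realm, hdfs@DOMAIN.COM maps to hdfs, while hdfs@TEST.COM is left unmapped.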
Let's look at a translation rule. It has three parts: base, filter and substitution.
The base uses $0 to represent the realm, $1 for the first component and $2 for the second component of the username.
Consider the principal hdfs/admin@DOMAIN.COM:
here $0 is DOMAIN.COM, $1 is hdfs and $2 is admin.
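The filter, written in parentheses, selects the principals that the rule applies to. For example, a filter of (hdfs@DOMAIN.COM) selects only the principal hdfs@DOMAIN.COM.
Finally, the substitution replaces hdfs@DOMAIN.COM with hdfs (/hdfs/).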
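The three parts can be put together in a rough Python sketch. It assumes the commonly used rule layout RULE:[n:base](filter)s/pattern/replacement/ and is a simplified stand-in for Hadoop's own parser: apply_rule and RULE_RE are hypothetical names, and Python's regex dialect differs in places from the one Hadoop uses.

```python
import re
from typing import Optional

# RULE:[<component count>:<base>](<filter>)s/<pattern>/<replacement>/[g]
RULE_RE = re.compile(
    r"RULE:\[(\d+):([^\]]*)\]"          # component count and base format
    r"(?:\(([^)]*)\))?"                 # optional filter regex
    r"(?:s/([^/]*)/([^/]*)/(g)?)?"      # optional sed-style substitution
)


def apply_rule(rule: str, principal: str) -> Optional[str]:
    """Return the short name, or None if the rule does not apply."""
    parsed = RULE_RE.fullmatch(rule)
    if parsed is None:
        raise ValueError(f"cannot parse rule: {rule}")
    count, base_fmt, flt, pattern, replacement, global_flag = parsed.groups()

    name, _, realm = principal.partition("@")
    components = name.split("/")
    if len(components) != int(count):
        return None  # rule is written for a different number of components

    # Base: $0 is the realm, $1 the first component, $2 the second.
    params = [realm] + components
    base = re.sub(r"\$(\d)", lambda m: params[int(m.group(1))], base_fmt)

    # Filter: the rule applies only if the whole base string matches.
    if flt is not None and re.fullmatch(flt, base) is None:
        return None

    # Substitution: rewrite the first match, or every match with "g".
    if pattern is None:
        return base
    return re.sub(pattern, replacement, base, count=0 if global_flag else 1)
```

For instance, apply_rule("RULE:[1:$1@$0](hdfs@DOMAIN.COM)s/.*/hdfs/", "hdfs@DOMAIN.COM") builds the base hdfs@DOMAIN.COM, passes the filter and substitutes the whole string with hdfs. The same rule returns None for yarn@DOMAIN.COM because the filter rejects it.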
How to test your rules?
Use the following command to test your regex translation rules:
hadoop org.apache.hadoop.security.HadoopKerberosName hdfs@DOMAIN.COM
Name: hdfs@DOMAIN.COM to hdfs
If no rule matches the principal, the command fails with the following error:
hadoop org.apache.hadoop.security.HadoopKerberosName hdfs@TEST.COM
Exception in thread "main" org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: No rules applied to hdfs@TEST.COM
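When there are many principals to check, the command above can be wrapped in a small script. The sketch below assumes the hadoop executable is on the PATH and that successful output has the "Name: ... to ..." form shown earlier; parse_mapping and check_principals are hypothetical helpers written for this post.

```python
import re
import subprocess
from typing import Dict, List, Optional, Tuple

# Matches lines such as "Name: hdfs@DOMAIN.COM to hdfs".
NAME_LINE = re.compile(r"^Name: (\S+) to (\S+)$")


def parse_mapping(line: str) -> Optional[Tuple[str, str]]:
    """Return (principal, short name) for a result line, else None."""
    match = NAME_LINE.match(line.strip())
    return (match.group(1), match.group(2)) if match else None


def check_principals(principals: List[str]) -> Dict[str, Optional[str]]:
    """Run HadoopKerberosName once per principal and collect the results.

    A value of None means no mapping line was printed, for example
    because no rule matched the principal.
    """
    results: Dict[str, Optional[str]] = {}
    for principal in principals:
        proc = subprocess.run(
            ["hadoop", "org.apache.hadoop.security.HadoopKerberosName", principal],
            capture_output=True,
            text=True,
        )
        short_name = None
        for line in proc.stdout.splitlines():
            parsed = parse_mapping(line)
            if parsed is not None:
                short_name = parsed[1]
        results[principal] = short_name
    return results
```

On a node with Hadoop installed, check_principals(["hdfs@DOMAIN.COM", "hdfs@TEST.COM"]) returns a dictionary in which principals with no matching rule are mapped to None.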