Kerberos Authentication in a Hadoop Cluster
What is Kerberos?
Kerberos is a network authentication protocol.
It is designed to provide strong authentication for client/server applications. In Hadoop, there are two types of authentication:
1. Simple
2. Kerberos
Simple is the default authentication mode in Hadoop; to use Kerberos, you need to set it up in your Hadoop cluster.
Setting up Kerberos
To set up Kerberos on a cluster, one node acts as the KDC (Key Distribution Centre) server and the others act as clients or workstations. Kerberos is sensitive to its environment in some respects, so there are a few prerequisites we need to take care of before installing Kerberos on all the nodes.
1. Install NTP: In a Kerberos client/server setup, all servers need a synchronized date and time, as Kerberos is a time-sensitive protocol whose authentication is based partly on the timestamps of the tickets. Run the commands below to install NTP on a CentOS machine.
yum -y install ntp
ntpdate 0.rhel.pool.ntp.org
service ntpd start
2. DNS Lookup: Configure DNS correctly and test that forward and reverse DNS lookups work. Use the commands below for DNS verification.
yum install bind-utils
For a forward lookup:
nslookup <fqdn>
For a reverse lookup:
nslookup <public IP>
KDC Installation:
The packages required for KDC setup are:
1. krb5-server : the KDC server
2. krb5-libs : Kerberos shared libraries
3. krb5-workstation : client package
Run the following command to install these packages:
yum -y install krb5-server krb5-libs krb5-workstation
Configuring the KDC:
This setup requires one server to act as the Kerberos KDC server, while all other nodes in the cluster act as clients (in a single-node setup, the KDC server can also act as a client). Let's assume a two-server setup: one server acts as the KDC server and the other is the client.
Let's assume that our domain name is random.com and that the fully qualified hostname of our KDC server is:
kerberos.random.com
There are a few files that need to be changed according to your configuration. The KDC itself is generally installed under "/var/kerberos/".
1. krb5.conf - It is present at "/etc/krb5.conf". Ensure that your default realm is set to your domain name in upper case.
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
default_realm = RANDOM.COM
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
[realms]
RANDOM.COM = {
kdc = kerberos.random.com
admin_server = kerberos.random.com
}
[domain_realm]
.random.com = RANDOM.COM
random.com = RANDOM.COM
Here, in the [realms] section, we set kdc to the hostname of the KDC server.
The first entry in [domain_realm] maps all hosts under the domain .random.com into the RANDOM.COM realm, but not a host named exactly random.com; that host is matched by the second entry. Remember, this file configures the client side, so it should be present on all clients.
2. kdc.conf - It is present at "/var/kerberos/krb5kdc/". Change the realm name here.
default_realm = RANDOM.COM
[kdcdefaults]
kdc_ports = 88
kdc_tcp_ports = 88
[realms]
RANDOM.COM = {
#master_key_type = aes256-cts
acl_file = /var/kerberos/krb5kdc/kadm5.acl
dict_file = /usr/share/dict/words
admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
supported_enctypes = aes256-cts:normal aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal
}
3. kadm5.acl - It is present at "/var/kerberos/krb5kdc/". Change the realm here as well.
*/admin@RANDOM.COM *
This line grants any principal in the RANDOM.COM realm with an admin instance full administrative privileges (the trailing * grants all privileges). For more examples, refer to https://web.mit.edu/kerberos/krb5-1.12/doc/admin/conf_files/kadm5_acl.html.
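As an illustration, a more restrictive entry (using a hypothetical joe/admin principal) could grant only specific privileges instead of all of them:

```
# Hypothetical: joe/admin may add (a), inquire (i), and list (l)
# principals, but cannot delete or modify them.
joe/admin@RANDOM.COM  ail
```

Each letter in the permission field enables one kadmin operation, so admin rights can be handed out per task rather than all-or-nothing.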
The KDC server is now configured. Let's create the database for the KDC to hold all the principals. Below is the command for creating the database for your realm; it will ask you for a password, which it stashes (the -s flag) so that you don't have to enter it in the future.
kdb5_util create -r RANDOM.COM -s
Now, on the KDC server, create an admin user with the following command:
kadmin.local
It will open the kadmin shell.
kadmin.local: addprinc root/admin
kadmin.local: ktadd -kt /var/kerberos/krb5kdc/kadm5.keytab kadmin/admin
kadmin.local: ktadd -kt /var/kerberos/krb5kdc/kadm5.keytab kadmin/changepw
kadmin.local: exit
"addprinc root/admin" will add a principal with root permissions,
"ktadd -kt <keytab> <principal>" will add the principal to the keytab file which is used by the client. After that just start the Kerberos with the following commands.
service krb5kdc start
service kadmin start
Now that our KDC server is up and running, let's set up our clients, which in our case make up the Hadoop cluster.
Note that when configuring Hadoop in secure mode, each user and service needs to be authenticated by Kerberos in order to use Hadoop services.
It is recommended that each Hadoop service run as a different Unix user; e.g., for HDFS and YARN we can have hdfs and yarn as users.
To run Hadoop service daemons in secure mode, Kerberos principals are required. Each service reads its authentication information from a keytab file whose permissions allow access only by that particular service user.
Configuring HDFS:
For HDFS, we will create a keytab file each for the Namenode, Secondary Namenode, and Datanode. To create a keytab, follow these steps:
1. Connect to the KDC server and add principals for the respective service, e.g., for the Namenode:
kadmin -p root/admin
This command connects you to the KDC server, asks for the password (root/admin is the principal we added above), and opens the kadmin shell.
2. Add a principal nn/full.qualified.domain.name@RANDOM.COM to the KDC database with the step below; also add a principal for the host:
addprinc -randkey nn/full.qualified.domain.name@RANDOM.COM
addprinc -randkey host/full.qualified.domain.name@RANDOM.COM
Note: -randkey generates a random password for the principal automatically.
3. Create a keytab file for the Namenode:
ktadd -kt /etc/security/keytab/nn.service.keytab nn/full.qualified.domain.name@RANDOM.COM
ktadd -kt /etc/security/keytab/nn.service.keytab host/full.qualified.domain.name@RANDOM.COM
ktadd -kt /etc/security/keytab/nn.service.keytab HTTP/full.qualified.domain.name@RANDOM.COM
4. You can verify the keytab's contents with the command below:
klist -e -k -t /etc/security/keytab/nn.service.keytab
Restrict this keytab file's permissions so that only the corresponding service user can read it (in this case, hdfs).
Repeat similar steps for the Secondary Namenode and Datanode.
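The permission tightening above can be sketched as follows; this uses a scratch file as a stand-in for the real keytab path (in production, also chown the keytab to the service user, e.g. hdfs):

```shell
# Stand-in for /etc/security/keytab/nn.service.keytab.
keytab=$(mktemp)
# Owner read-only; no access for group or others.
chmod 400 "$keytab"
stat -c '%a' "$keytab"   # prints: 400
```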
Configure hdfs-site.xml for Kerberos.
Namenode Properties:
Note: Hadoop will automatically substitute _HOST with the server's fully qualified hostname.
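As a sketch (property names follow the standard Hadoop secure-mode configuration; the keytab path and realm are this walkthrough's assumptions), the Namenode section of hdfs-site.xml could look like:

```xml
<property>
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>nn/_HOST@RANDOM.COM</value>
</property>
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/etc/security/keytab/nn.service.keytab</value>
</property>
<property>
  <name>dfs.namenode.kerberos.internal.spnego.principal</name>
  <value>HTTP/_HOST@RANDOM.COM</value>
</property>
```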
Journalnode Properties:
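A comparable sketch for the Journalnode, assuming a jn service principal and a keytab created the same way as the Namenode's:

```xml
<property>
  <name>dfs.journalnode.kerberos.principal</name>
  <value>jn/_HOST@RANDOM.COM</value>
</property>
<property>
  <name>dfs.journalnode.keytab.file</name>
  <value>/etc/security/keytab/jn.service.keytab</value>
</property>
<property>
  <name>dfs.journalnode.kerberos.internal.spnego.principal</name>
  <value>HTTP/_HOST@RANDOM.COM</value>
</property>
```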
Datanode Properties:
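A sketch for the Datanode, assuming a dn service principal; note that a secure Datanode must either bind to privileged ports (as below) or use SASL via dfs.data.transfer.protection on newer Hadoop versions:

```xml
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>dn/_HOST@RANDOM.COM</value>
</property>
<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/etc/security/keytab/dn.service.keytab</value>
</property>
<!-- Privileged ports (<1024) so the secure Datanode can prove root started it -->
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:1004</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>0.0.0.0:1006</value>
</property>
```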
YARN (ResourceManager) Properties:
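For YARN, the ResourceManager (and NodeManager) principals and keytabs go in yarn-site.xml rather than hdfs-site.xml; a sketch, assuming rm and nm service principals:

```xml
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>rm/_HOST@RANDOM.COM</value>
</property>
<property>
  <name>yarn.resourcemanager.keytab</name>
  <value>/etc/security/keytab/rm.service.keytab</value>
</property>
<property>
  <name>yarn.nodemanager.principal</name>
  <value>nm/_HOST@RANDOM.COM</value>
</property>
<property>
  <name>yarn.nodemanager.keytab</name>
  <value>/etc/security/keytab/nm.service.keytab</value>
</property>
```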
Here, in the principal name "nn/_HOST@REALM.TLD", nn represents the username. We need to map this name to our hdfs user, and likewise for the other services; this can be done by adding the following properties.
Similarly, add a mapping for every principal associated with that particular user.
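One way to express this mapping is with hadoop.security.auth_to_local rules in core-site.xml; the rules below are a sketch for the service principals used in this walkthrough (nn, sn, dn map to the hdfs user; rm, nm to the yarn user):

```xml
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1@$0](nn@RANDOM.COM)s/.*/hdfs/
    RULE:[2:$1@$0](sn@RANDOM.COM)s/.*/hdfs/
    RULE:[2:$1@$0](dn@RANDOM.COM)s/.*/hdfs/
    RULE:[2:$1@$0](rm@RANDOM.COM)s/.*/yarn/
    RULE:[2:$1@$0](nm@RANDOM.COM)s/.*/yarn/
    DEFAULT
  </value>
</property>
```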
Configure core-site.xml for Kerberos.
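At a minimum, switching core-site.xml from the default Simple mode to Kerberos involves these two standard properties:

```xml
<!-- Switch from the default "simple" authentication to Kerberos -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<!-- Enable service-level authorization checks -->
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
```

After distributing the updated configuration, restart the Hadoop daemons for the change to take effect.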
Kerberos Authentication, reviewed by Unknown on 2:54 AM.