Tuesday, October 24, 2017

Unix iptables IP forwarding


In my home lab I have KVM installed on an HP server, and I was managing all the virtual machines from my Ubuntu laptop using the virt-manager client.

But I couldn't access the virtual machines from other devices on my Wi-Fi network, and there is no virt-manager client for the Windows operating system.

So I worked on configuring my KVM machine as a router, so that all the VMs are reachable from the Wi-Fi LAN.

Here are my home lab and Wi-Fi networks:

The KVM machine has:

one Ethernet card, enp4s0f1, connected to 192.168.1.6 (Wi-Fi network)
one KVM bridge interface (virbr1) connected to 192.168.100.1 (VM private network)

Windows laptop:

My Windows laptop is connected to the same Wi-Fi network, at 192.168.1.4.

Using PuTTY on the Windows laptop, I was able to connect to the KVM machine directly, but not to the virtual machines.

Also, any applications running on the VMs (web servers, Cloudera Manager) were not accessible from my Windows laptop.

On Windows Laptop

Update the routing table to send all 192.168.100.0/24 (VM private network) traffic to the gateway 192.168.1.6 (the KVM host's Wi-Fi network IP address):

route add 192.168.100.0 mask 255.255.255.0 192.168.1.6

On a Linux desktop the equivalent is:

sudo route add -net 192.168.100.0/24 gw 192.168.1.6
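
The Windows route added above is lost after a reboot; a small sketch to make it persistent (same network and gateway as above) is to add the -p flag:

route -p add 192.168.100.0 mask 255.255.255.0 192.168.1.6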

On KVM Machine

Enable IP forwarding.

Update the /etc/sysctl.conf file:
net.ipv4.ip_forward=1

Run this command for the change to take effect:
sysctl -p

If iptables is enabled, add the rules below.

-A FORWARD -i enp4s0f1 -o virbr1 -j ACCEPT

-I FORWARD 1 -j LOG --log-prefix "RULE4:" --log-level 7   (to enable debug logging)
-I FORWARD -p tcp --dport 22 -j ACCEPT
-I FORWARD -p tcp --dport 7180 -j ACCEPT   (to access the Cloudera Manager URL)
-I FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT

Make sure the firewall is stopped on the guest hosts (systemctl stop firewalld).
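
For reference, a minimal sketch of applying the same kind of rule directly with the iptables command on the KVM host and persisting it (interface names are from my setup above; the save step assumes a RHEL/CentOS-style iptables service and will differ on other distributions):

iptables -I FORWARD -i enp4s0f1 -o virbr1 -j ACCEPT                    # allow Wi-Fi-side traffic to reach the VM bridge
iptables -I FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT     # allow return traffic
service iptables save                                                  # persist the running rules (RHEL/CentOS with iptables-services)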

Wednesday, October 18, 2017

Adding TLS/SSL for Impala Services

Setting up TLS for Impala is also very simple.

Follow the post OpenSSL CA Authority setup with SAN certificate for Cloudera to create a SAN certificate for the Impala service in PEM format.

Update the PEM key, certificate, key password, and CA PEM file settings as shown below.

Then restart the Impala services.
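
Before restarting, it is worth checking that the generated certificate chains back to the root CA and carries the expected SAN entries. A minimal sketch, assuming the file layout produced by the CA scripts in that post (adjust the paths to wherever you copied the Impala certificate):

openssl verify -CAfile MyRootCA.pem impala.tanu.com/impala.tanu.com.pem
openssl x509 -in impala.tanu.com/impala.tanu.com.pem -noout -text | grep -A1 "Subject Alternative Name"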



Since my cluster is Kerberos enabled, we need a valid ticket to access impala-shell; otherwise you will get the following error.

[hive@nm1 ~]$ impala-shell -i node1.tanu.com:21000 -k --ssl --ca_cert=/opt/cloude-newcert/MyRootCA.pem
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
-k requires a valid kerberos ticket but no valid kerberos ticket found.

Now create a ticket using the kinit command and run the same impala-shell command again; we are now able to access the database over TLS.

[hive@nm1 ~]$ kinit 
Password for hive@TANU.COM: 
[hive@nm1 ~]$ impala-shell -i node1.tanu.com:21000 -k --ssl --ca_cert=/opt/cloude-newcert/MyRootCA.pem
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
SSL is enabled
Connected to node1.tanu.com:21000
Server version: impalad version 2.9.0-cdh5.12.1 RELEASE (build 5131a031f4aa38c1e50c430373c55ca53e0517b9)
***********************************************************************************
Welcome to the Impala shell.
(Impala Shell v2.9.0-cdh5.12.1 (5131a03) built on Thu Aug 24 09:27:32 PDT 2017)

When pretty-printing is disabled, you can use the '--output_delimiter' flag to set
the delimiter for fields in the same row. The default is ','.
***********************************************************************************
[node1.tanu.com:21000] > show databases;
Query: show databases
+------------------+----------------------------------------------+
| name             | comment                                      |
+------------------+----------------------------------------------+
| _impala_builtins | System database for Impala builtin functions |
| default          | Default Hive database                        |
| tanu             |                                              |
+------------------+----------------------------------------------+
Fetched 3 row(s) in 0.04s





OpenSSL CA Authority setup with SAN certificate for Cloudera

As I mentioned in my previous blog Secure hadoop/cloudera cluster using PKI implementation, setting up PKI is good practice. In that blog I didn't cover SAN certificates, which become important when setting up a front-end load balancer for the Hive/Impala services in a multi-node cluster.


I have created a simple shell script driven by an input file. It creates a single root CA key and certificate, and then creates JKS/PEM format keystores and truststores for hive/impala/cm/hue.


Certificate input file

#LBNAME:NODES:FORMAT(PEM|JKS)
hive.tanu.com:nm1.tanu.com,node1.tanu.com,node2.tanu.com,hive.tanu.com:jks
hue.tanu.com:nm1.tanu.com,node1.tanu.com,node2.tanu.com,hue.tanu.com:pem
impala.tanu.com:nm1.tanu.com,node1.tanu.com,node2.tanu.com,impala.tanu.com:jks
cm.tanu.com:nm1.tanu.com,node1.tanu.com,node2.tanu.com,cm.tanu.com:jks



ROOT CA creation script:

root@ubuntu:/mnt/pool/usb-disk/X509CA/Project-cloudera1# cat Create_ROOTCA.sh
openssl genrsa -des3 -out MyRootCA.key 2048
openssl req -x509 -new -nodes -key MyRootCA.key -sha256 -days 5000 -out MyRootCA.pem -subj "/C=US/ST=NY/L=NYC/O=Global Security/OU=IT Department/CN=Hadoop CA Authority"


SAN Certificate creation script (JKS format)

export PATH=$PATH:/opt/jdk1.8.0_144/bin
# Process only the JKS entries in the input file, skipping comment lines
for cnname in `cat certificates.txt|grep -i ":jks"|grep -v "^#"`
        do
                unset san
                cn=$(echo $cnname|cut -d":" -f1)
                count=1
                # Build an openssl extension file listing every host as a SAN entry
                echo "[ req ]">openssl-ext.cnf
                echo "req_extensions   = v3_req" >>openssl-ext.cnf
                echo "[ v3_req ]" >>openssl-ext.cnf
                echo "subjectAltName = @alt_names" >>openssl-ext.cnf
                echo "[alt_names]" >>openssl-ext.cnf
                for x in $(echo $cnname|cut -d":" -f2|tr "," "\n")
                        do
                        san+="dns:$x,"
                        echo "DNS.$count = $x">>openssl-ext.cnf
                        count=$((count + 1 ))
                done
                san=${san%,}   # drop the trailing comma so keytool accepts the SAN extension
                echo $san
                mkdir $cn
                # Generate the key pair and a CSR that carries the SAN extension
                keytool -genkey -alias $cn -keyalg RSA -keystore $cn/$cn.jks -keysize 2048 -dname "CN=$cn,OU=IT,O=IT,L=NYC,S=NY,C=US" -storepass cloud123 -keypass cloud123
                keytool -certreq -alias $cn -keystore $cn/$cn.jks -file $cn/$cn.csr -ext SAN="$san" -storepass cloud123 -keypass cloud123
                # Sign the CSR with the root CA, re-applying the SAN extension from openssl-ext.cnf
                openssl x509 -req -in $cn/$cn.csr -CA MyRootCA.pem -CAkey MyRootCA.key -CAcreateserial -out $cn/$cn.pem -days 5000  -sha256 -extfile openssl-ext.cnf -extensions v3_req -passin pass:sathish123
                echo "Importing CA certificate into client keystore"
                keytool -importcert -trustcacerts -keystore $cn/$cn.jks -alias MyRootCA.pem -file MyRootCA.pem -storepass cloud123 -keypass cloud123 -noprompt
                echo "Importing signed certificate into client keystore"
                keytool -importcert -trustcacerts -keystore $cn/$cn.jks -alias $cn -file $cn/$cn.pem -storepass cloud123 -keypass cloud123 -noprompt
                echo "Creating truststore and importing CA certificate"
                keytool -importcert -keystore $cn/$cn.truststore -alias MyRootCA -file MyRootCA.pem -storepass cloud123 -keypass cloud123 -noprompt
        done

SAN Certificate creation script (PEM format)

root@ubuntu:/mnt/pool/usb-disk/X509CA/Project-cloudera1# cat pem__cert_create.sh
export PATH=$PATH:/opt/jdk1.8.0_144/bin
# Process only the PEM entries in the input file, skipping comment lines
for cnname in `cat certificates.txt|grep -i ":pem"|grep -v "^#"`
        do
                unset san
                cn=$(echo $cnname|cut -d":" -f1)
                count=1
                echo "[ req ]" >openssl-ext.cnf
                echo "distinguished_name = req_distinguished_name" >>openssl-ext.cnf
                echo "req_extensions   = v3_req" >>openssl-ext.cnf
                echo "prompt = no" >>openssl-ext.cnf
                echo "[req_distinguished_name]" >>openssl-ext.cnf
                echo "C = US" >>openssl-ext.cnf
                echo "ST = NY" >>openssl-ext.cnf
                echo "L = NYC" >>openssl-ext.cnf
                echo "O = Global Security" >>openssl-ext.cnf
                echo "OU = IT Department" >>openssl-ext.cnf
                echo "CN = $cn" >>openssl-ext.cnf
                echo "[ v3_req ]" >>openssl-ext.cnf
                echo "subjectAltName = @alt_names" >>openssl-ext.cnf
                echo "[alt_names]" >>openssl-ext.cnf
                for x in $(echo $cnname|cut -d":" -f2|tr "," "\n")
                        do
                        san+="dns:$x,"
                        echo "DNS.$count = $x">>openssl-ext.cnf
                        count=$((count + 1 ))
                        done

                echo $san
                mkdir $cn
                openssl genrsa -des3 -out $cn/$cn.key -passout pass:cloud123 2048
                openssl req -new -out $cn/$cn.csr -key  $cn/$cn.key  -passin pass:cloud123 -config openssl-ext.cnf
                openssl x509 -req -in $cn/$cn.csr -CA MyRootCA.pem -CAkey MyRootCA.key -CAcreateserial -out $cn/$cn.pem -days 5000  -sha256 -extfile openssl-ext.cnf -extensions v3_req -passin pass:sathish123

        done
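
A quick usage sketch, assuming certificates.txt and the scripts sit in the same directory (the JKS script filename below is illustrative since it isn't shown above, and Create_ROOTCA.sh will prompt for the CA key passphrase):

bash Create_ROOTCA.sh
bash jks_cert_create.sh       # the JKS loop above, saved under an illustrative name
bash pem__cert_create.sh
openssl x509 -in hue.tanu.com/hue.tanu.com.pem -noout -text | grep -A1 "Subject Alternative Name"

The last command should list every host from the corresponding line of certificates.txt as a DNS entry.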

Cloudera Director installation on AWS

Our client wanted to set up the Cloudera Data Science Workbench on-prem; however, the existing infrastructure did not support a Red Hat 7.2 build, which is the main prerequisite for CDSW. Because of that, our client decided to move the Cloudera Manager and Data Science Workbench setup to the AWS cloud.

I will post another blog on how to install Cloudera Manager and the Cloudera Data Science Workbench in AWS using Cloudera Director. In this blog I will share how we set up Cloudera Director in the AWS cloud.

Cloudera Director provides a UI and a command-line interface to dynamically create Cloudera Manager environments and spin up or scale Cloudera clusters in Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure, simply by specifying how many master, worker, and gateway instances we want.

We have already created the VPC, security groups, subnets, and a Red Hat 7.2 instance for Cloudera Director in AWS.

Cloudera Director installation:

1. Install JDK 1.8.

2. Install the wget package if the wget command is not available.

3. Create the Cloudera Director repository by running:
cd /etc/yum.repos.d/
sudo wget "http://archive.cloudera.com/director/redhat/7/x86_64/director/cloudera-director.repo"

4. Install the Cloudera Director server and client by running:
sudo yum install cloudera-director-server cloudera-director-client

5. Start the Cloudera Director server by running:
sudo service cloudera-director-server start

6. If the RHEL 7 or CentOS firewall is running on the EC2 instance where you have installed Cloudera Director, disable and stop the firewall:
sudo systemctl disable firewalld
sudo systemctl stop firewalld


Once the server started, in our setup it failed by default because the backend metastore needed to be configured.

Edit the properties file /etc/cloudera-director-server/application.properties.
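
For example, a minimal sketch of pointing the server at an external MySQL database (the lp.database.* property names are the ones I recall from the Cloudera Director documentation, and the host, credentials, and database name below are illustrative; check the comments inside application.properties for the exact keys in your version):

lp.database.type: mysql
lp.database.username: director
lp.database.password: password
lp.database.host: db.example.com
lp.database.port: 3306
lp.database.name: director

After saving the file, restart the server with sudo service cloudera-director-server restart.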






Tuesday, October 17, 2017

Setting up Load balancer for HIVE in secure cluster with TLS support


Recently I got a requirement from the business to enable high availability with a front-end load balancer for the Hive and Impala services.

There were a couple of issues I faced during the Hive HA implementation in the Kerberos-enabled cluster with TLS:

1) The load balancer host name (i.e. hive.tanu.com) was missing from hive.keytab; because of that, beeline threw the exception below.

2) Each Hive instance had its own certificate with the common name of its respective host (CN=node1.tanu.com, CN=node2.tanu.com, etc.), so when we used the load balancer host name hive.tanu.com, beeline started throwing an SSL "name not matching" exception.

hive@nm1 ~]$ beeline -u 'jdbc:hive2://hive.tanu.com:9009/default;principal=hive/_HOST@TANU.COM;ssl=true;sslTrustStore=/opt/hivekeystore/truststore.ts;trustStorePassword=sathish123;'
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
scan complete in 2ms
Connecting to jdbc:hive2://hive.tanu.com:9009/default;principal=hive/_HOST@TANU.COM;ssl=true;sslTrustStore=/opt/hivekeystore/truststore.ts;trustStorePassword=sathish123;
17/10/17 21:53:33 [main]: ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]

[root@nm1 350-hive-HIVESERVER2]# klist -k -t -K hive.keytab
Keytab name: FILE:hive.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 10/07/2017 20:24:24 hive/nm1.tanu.com@TANU.COM (0x370093cbe6049b97e2f5fd608e24d534)
   1 10/07/2017 20:24:24 HTTP/nm1.tanu.com@TANU.COM (0xb2f42608e82b6b60e4b540084d8165ce)

To overcome the above issues,

first we need to add the load balancer hostname in Cloudera Manager --> Clusters --> Hive --> Configuration --> search for "load balancer" --> and set the load balancer hostname and port.

Then go to Administration --> Security --> Kerberos Credentials --> Generate Missing Credentials.

Once you complete the above steps, you can see the Hive load balancer hostname in hive.keytab:


[root@nm1 369-hive-HIVESERVER2]# klist -k -t -K hive.keytab 
Keytab name: FILE:hive.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 10/17/2017 22:24:55 HTTP/nm1.tanu.com@TANU.COM (0xb2f42608e82b6b60e4b540084d8165ce)
   1 10/17/2017 22:24:55 hive/hive.tanu.com@TANU.COM (0xc9499ade1c6b310252b661ed3096f548)
   1 10/17/2017 22:24:55 hive/nm1.tanu.com@TANU.COM (0x370093cbe6049b97e2f5fd608e24d534)


And for the SSL CN mismatch, we need to create a SAN certificate and include all the Hive host names in the SAN field.




Follow the blog post OpenSSL CA Authority setup with SAN certificate for Cloudera for how to create a SAN certificate for Hive, and refer to the Cloudera documentation for enabling TLS support.

Enabling Hive TLS support is covered in the Cloudera documentation; the steps are pretty simple.




Enable debug logging for HDFS in bash

Recently I started facing an issue executing hdfs commands on my Linux server.

After a Google search I ended up with the environment variables below.



export HADOOP_ROOT_LOGGER=DEBUG,console

export HADOOP_DATANODE_OPTS="${HADOOP_DATANODE_OPTS} -Dhadoop.root.logger=DEBUG,DRFA"

export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -Dhadoop.root.logger=DEBUG,DRFA" 


I was able to debug with the first environment variable alone.
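
If you only need the verbose output for a single command, a small sketch is to scope the variable to just that command, so debug logging isn't left on for the whole shell session:

HADOOP_ROOT_LOGGER=DEBUG,console hdfs dfs -ls /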


[hdfs@datanode1:[PTA] ~]$ hdfs dfs -ls /
17/10/17 15:24:08 DEBUG util.Shell: setsid exited with exit code 0
17/10/17 15:24:08 DEBUG conf.Configuration: parsing URL jar:file:/opt/cloudera/parcels/CDH-5.8.4-1.cdh5.8.4.p0.5/jars/hadoop-common-2.6.0-cdh5.8.4.jar!/core-default.xml
17/10/17 15:24:08 DEBUG conf.Configuration: parsing input stream sun.net.www.protocol.jar.JarURLConnection$JarURLInputStream@25c6ca49
17/10/17 15:24:08 DEBUG conf.Configuration: parsing URL file:/etc/hadoop/conf.cloudera.yarn/core-site.xml
17/10/17 15:24:08 DEBUG conf.Configuration: parsing input stream java.io.BufferedInputStream@1412504b
17/10/17 15:24:08 DEBUG core.Tracer: sampler.classes = ; loaded no samplers
17/10/17 15:24:08 DEBUG core.Tracer: span.receiver.classes = ; loaded no span receivers
17/10/17 15:24:08 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
17/10/17 15:24:08 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
17/10/17 15:24:08 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
17/10/17 15:24:08 DEBUG lib.MutableMetricsFactory: field private org.apache.hadoop.metrics2.lib.MutableGaugeLong org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailuresTotal with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Renewal failures since startup], about=, type=DEFAULT, always=false, sampleName=Ops)
17/10/17 15:24:08 DEBUG lib.MutableMetricsFactory: field private org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailures with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Renewal failures since last successful login], about=, type=DEFAULT, always=false, sampleName=Ops)
17/10/17 15:24:08 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
17/10/17 15:24:09 DEBUG security.SecurityUtil: Setting hadoop.security.token.service.use_ip to true
17/10/17 15:24:09 DEBUG security.Groups:  Creating new Groups object
17/10/17 15:24:09 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000; warningDeltaMs=5000
17/10/17 15:24:09 DEBUG security.UserGroupInformation: hadoop login
17/10/17 15:24:09 DEBUG security.UserGroupInformation: hadoop login commit
17/10/17 15:24:09 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: hdfs
17/10/17 15:24:09 DEBUG security.UserGroupInformation: Using user: "UnixPrincipal: hdfs" with name hdfs
17/10/17 15:24:09 DEBUG security.UserGroupInformation: User entry: "hdfs"
17/10/17 15:24:09 DEBUG security.UserGroupInformation: Assuming keytab is managed externally since logged in from subject.
17/10/17 15:24:09 DEBUG security.UserGroupInformation: UGI loginUser:hdfs (auth:SIMPLE)
17/10/17 15:24:09 DEBUG core.Tracer: sampler.classes = ; loaded no samplers
17/10/17 15:24:09 DEBUG core.Tracer: span.receiver.classes = ; loaded no span receivers
17/10/17 15:24:09 DEBUG hdfs.BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
17/10/17 15:24:09 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit = false
17/10/17 15:24:09 DEBUG hdfs.BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
17/10/17 15:24:09 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
17/10/17 15:24:09 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
17/10/17 15:24:09 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@62e98ccf
17/10/17 15:24:09 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@38d1b194
17/10/17 15:24:09 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
17/10/17 15:24:09 DEBUG util.NativeCodeLoader: Loaded the native-hadoop library
17/10/17 15:24:09 DEBUG unix.DomainSocketWatcher: org.apache.hadoop.net.unix.DomainSocketWatcher$2@350599cc: starting with interruptCheckPeriodMs = 60000
17/10/17 15:24:09 DEBUG util.PerformanceAdvisory: Both short-circuit local reads and UNIX domain socket are disabled.
17/10/17 15:24:09 DEBUG sasl.DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
17/10/17 15:24:09 DEBUG ipc.Client: The ping interval is 60000 ms.
17/10/17 15:24:09 DEBUG ipc.Client: Connecting to namenode.tanu.com/169.35.91.237:8020
17/10/17 15:24:09 DEBUG ipc.Client: IPC Client (16613795) connection to namenode.tanu.com/169.35.91.237:8020 from hdfs: starting, having connections 1
17/10/17 15:24:09 DEBUG ipc.Client: IPC Client (16613795) connection to namenode.tanu.com/169.35.91.237:8020 from hdfs sending #0
17/10/17 15:24:09 DEBUG ipc.Client: IPC Client (16613795) connection to namenode.tanu.com/169.35.91.237:8020 from hdfs got value #0
17/10/17 15:24:09 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 182ms
17/10/17 15:24:10 DEBUG ipc.Client: IPC Client (16613795) connection to namenode.tanu.com/169.35.91.237:8020 from hdfs sending #1
17/10/17 15:24:10 DEBUG ipc.Client: IPC Client (16613795) connection to namenode.tanu.com/169.35.91.237:8020 from hdfs got value #1
17/10/17 15:24:10 DEBUG ipc.ProtobufRpcEngine: Call: getListing took 2ms
Found 7 items
drwxr-x--x   - accumulo accumulo            0 2015-09-30 10:35 /accumulo
drwxr-xr-x   - hbase    hbase               0 2017-09-27 01:53 /hbase
drwxrwxr-x   - solr     solr                0 2017-10-05 06:55 /solr
drwxr-xr-x   - hdfs     supergroup          0 2017-05-02 09:31 /system
drwxrwxrwx   - hdfs     supergroup          0 2017-10-17 15:00 /tmp
drwxrwxrwx   - hdfs     supergroup          0 2017-10-12 12:13 /user
17/10/17 15:24:10 DEBUG ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@38d1b194
17/10/17 15:24:10 DEBUG ipc.Client: removing client from cache: org.apache.hadoop.ipc.Client@38d1b194
17/10/17 15:24:10 DEBUG ipc.Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@38d1b194
17/10/17 15:24:10 DEBUG ipc.Client: Stopping client
17/10/17 15:24:10 DEBUG ipc.Client: IPC Client (16613795) connection to namenode.tanu.com/169.35.91.237:8020 from hdfs: closed

17/10/17 15:24:10 DEBUG ipc.Client: IPC Client (16613795) connection to namenode.tanu.com/169.35.91.237:8020 from hdfs: stopped, remaining connections 0

Saturday, October 14, 2017

Enabling SPNEGO authentication for Hue and Browser agent settings

After we secured the cluster with Kerberos authentication, we also configured the SPNEGO authentication backend for Hue, so that all Active Directory authenticated users can access the Hue console directly.

Enabling SPNEGO authentication is straightforward: just follow the Cloudera documentation and it will work.

But for us it didn't work; Hue kept prompting for basic authentication, and in the Hue logs we got the exception below.

After some googling we found a suggestion to add the website to the Internet Explorer Intranet zone or Trusted Sites zone.

Once we added the site to Trusted Sites, we were able to access the Hue console without providing a username and password.

----------------------------------------------------------------------------

[13/Oct/2017 19:44:56 -0700] middleware   ERROR    Unexpected error when authenticating against KDC
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hue/desktop/core/src/desktop/middleware.py", line 598, in process_request
    r=kerberos.authGSSServerStep(context,authstr)
GSSError: (('Unspecified GSS failure.  Minor code may provide more information', 851968), ('Ticket not yet valid', -1765328351))
[13/Oct/2017 19:44:56 -0700] access       DEBUG    192.168.100.10 -anon- - "HEAD /desktop/debug/is_alive HTTP/1.1"

Friday, October 13, 2017

Secure hadoop/cloudera cluster using PKI implementation

Setting up SSL/TLS for Cloudera Manager and the other components using self-signed certificates is a little painful.

Since a Cloudera or Hadoop infrastructure has many nodes, creating self-signed certificates and maintaining truststores across all of them gets complex.

In my experience, implementing a single root CA authority for the entire cluster is good practice: the truststores then only need the CA certificate for TLS communication, rather than every node's certificate.

Here are the simple shell scripts to:

create the root CA certificate
create/issue and sign PEM format certificates for nodes
create/issue and sign JKS format certificates for Cloudera Manager and Hive

--------------------------------- Create root ca --------------------------------------

root@ubuntu:/mnt/pool/usb-disk/X509CA# cat Create_ROOTCA.sh
openssl genrsa -des3 -out MyRootCA.key 2048
openssl req -x509 -new -nodes -key MyRootCA.key -sha256 -days 5000 -out MyRootCA.pem -subj "/C=US/ST=NY/L=NYC/O=Global Security/OU=IT Department/CN=Hadoop CA Authority"

--------------- Create Certificate in PEM Format --------------------------

root@ubuntu:/mnt/pool/usb-disk/X509CA# cat Createcert_pem_format.sh
echo "Enter the client comman Name : "
read cn
mkdir $cn
openssl genrsa -des3 -out $cn/$cn.key 2048
openssl req -new -key $cn/$cn.key -out $cn/$cn.csr -subj "/C=US/ST=NY/L=NYC/O=Global Security/OU=IT Department/CN=$cn"
openssl x509 -req -in $cn/$cn.csr -CA MyRootCA.pem -CAkey MyRootCA.key -CAcreateserial -out $cn/$cn.pem -days 5000  -sha256
cp MyRootCA.pem $cn/
#openssl req -x509 -new -nodes -key MyRootCA.key -sha256 -days 5000 -out MyRootCA.pem -subj "/C=US/ST=NY/L=NYC/O=Global Security/OU=IT Department/CN=Hadoop CA Authority"

---------------- Create keystore and Truststore Certificate in JKS Format ---------------------------------------
root@ubuntu:/mnt/pool/usb-disk/X509CA# cat Createcert_jks_format.sh 
export PATH=$PATH:/opt/jdk1.8.0_144/bin
echo "Enter th Comman Name :"
read cn
mkdir $cn
keytool -genkey -alias $cn -keyalg RSA -keystore $cn/$cn.jks -keysize 2048 -dname "CN=$cn,OU=IT,O=IT,L=NYC,S=NY,C=US" 
keytool -certreq -alias $cn -keystore $cn/$cn.jks -file $cn/$cn.csr
openssl x509 -req -in $cn/$cn.csr -CA MyRootCA.pem -CAkey MyRootCA.key -CAcreateserial -out $cn/$cn.pem -days 5000  -sha256
keytool -importcert -trustcacerts -keystore $cn/$cn.jks -alias MyRootCA.pem -file MyRootCA.pem
keytool -importcert -trustcacerts -keystore $cn/$cn.jks -alias $cn -file $cn/$cn.pem
keytool -importcert -keystore $cn/$cn.truststore -alias MyRootCA -file MyRootCA.pem



Wednesday, October 11, 2017

Hadoop distcp between secure and non-secure clusters | Hive table manual replication

I was trying to transfer a large Hive table from one of our non-secure clusters into a Kerberos-enabled secure cluster.

Since I didn't have much temporary space on the source server, I was searching for a built-in Hadoop command that supports direct file transfer between secure and non-secure clusters.

If I run the command below from the secure cluster, it allows simple auth to connect to the non-secure cluster:

hdfs dfs -D ipc.client.fallback-to-simple-auth-allowed=true -copyToLocal hdfs://xxx.tanu.com:8020/user/hive/warehouse/tanu.db/tanu_info /user/hive/warehouse/tanu.db/

hdfs dfs -copyFromLocal tanu_info /user/hive/warehouse/tanu.db/
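
The same fallback property also works with distcp run from the secure cluster, which avoids staging the data locally; a minimal sketch (the destination NameNode address is illustrative):

hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://xxx.tanu.com:8020/user/hive/warehouse/tanu.db/tanu_info hdfs://namenode.tanu.com:8020/user/hive/warehouse/tanu.db/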

Run the SQL query below on the source Hive editor to get the table create statement:

show create table tanu_info

Then create the tanu_info table on the destination cluster using that create statement.

Once you have imported the Hive data, run the query below in the Hue Hive editor; it will rebuild the table from the loaded data:

msck repair table tanu_info

Sunday, October 8, 2017

Issues I faced during Cloudera 5.12.1 Kerberos setup

------------------------------------------------------------------------------------
Make sure default_ccache_name is commented out in krb5.conf; otherwise beeline will fail, because the Java GSS libraries look for a FILE:/tmp/krb5cc_* ticket cache in /tmp by default and cannot read the keyring cache that kinit would otherwise use.

includedir /etc/krb5.conf.d/

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 dns_lookup_realm = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true
 rdns = false
 default_realm = TANU.COM
# default_ccache_name = KEYRING:persistent:%{uid}

[realms]
 TANU.COM = {
  kdc = winad.tanu.com
  admin_server = winad.tanu.com
 }

[domain_realm]
 .tanu.com = TANU.COM
 tanu.com = TANU.COM
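
A quick way to confirm the change (a small sketch; the exact cache file name ends in your numeric UID):

kinit
klist   # the "Ticket cache:" line should now show FILE:/tmp/krb5cc_<uid> instead of a KEYRING cache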


--------------------------------------------------------------------------------------------
To debug Cloudera beeline Kerberos issues, or any Java SSL related issues, you can add JVM arguments to the variable below
--------------------------------------------------------------------------------------------

export HADOOP_CLIENT_OPTS="-Dsun.security.krb5.debug=true"
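
For the SSL side, the standard JSSE debug flag can be set the same way (a sketch; this produces very verbose handshake output):

export HADOOP_CLIENT_OPTS="-Djavax.net.debug=ssl"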


-----------------------------------------------------
if you face any agent UUID related issues and want to reattach the agent
-------------------------------------------------------------

stop the cloudera-scm-agent service
remove the agent host from Cloudera Manager
delete the file /var/lib/cloudera-scm-agent/uuid on the agent server
then start the agent (a command sketch follows)
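
A minimal command sketch of the same steps on the agent host (service name and file path as above; removing the host from Cloudera Manager itself is done in the CM UI):

sudo service cloudera-scm-agent stop
sudo rm /var/lib/cloudera-scm-agent/uuid
sudo service cloudera-scm-agent start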
------------------------------------------------------------
setup SSL/TLS for Hive --> Hive supports JKS format keystores and truststores
-----------------------------------------------------------------------------

Make sure the Common Name of every certificate matches its hostname (i.e. CN=node1.tanu.com)

to create keystore

keytool -genkey -alias hivecert -keyalg RSA -keystore keystore.jks

to create truststore

keytool -export -alias hivecert -file hivecert.cer -keystore keystore.jks

keytool -import -v -trustcacerts -alias hivecert -file hivecert.cer -keystore truststore.ts

add the Hue certificate to the truststore

first convert the Hue PEM format certificate into DER format

openssl x509 -outform der -in huecert.pem -out huecertificate.der
keytool -import -alias hueserver -keystore truststore.ts -file huecertificate.der
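
To confirm both the Hive and Hue entries landed in the truststore, a quick sketch:

keytool -list -keystore truststore.ts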

-------------------------------------------------------------------------------------
setup Hue SSL/TLS --> Hue supports PEM format certificates
-------------------------------------------------------------------------

To create keystore

openssl req -x509 -newkey rsa:4096 -keyout huekey.pem -out huecert.pem -days 3650

to create truststore or CA bundle

cp huecert.pem huecerttrust.pem

convert the Hive DER format certificate into PEM and append it to the CA bundle

openssl x509 -inform der -in hivecert.cer -out hivecert.pem
cat hivecert.pem >>huecerttrust.pem