Hadoop Knowledge base share

Friday, February 7, 2020

Cloudera HIVE BDR schedule via Python script

I wrote this script for a new use case, or maybe just to reduce my own workload. Replicating all the Hive databases from production to the DR cluster in a single job is very slow; it takes about a day to complete.

I thought of splitting the 100 databases into multiple batches, about 10 per Hive BDR configuration. Creating 10 batches and updating each configuration by hand felt like painful work, so I wrote a script to do it.
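A minimal sketch of the batching idea, assuming the database names are already available as a Python list. The database names and the split_into_batches helper are made up for illustration, and creating the BDR schedules themselves is not shown here.

```python
def split_into_batches(databases, batch_size=10):
    """Split a list of database names into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [databases[i:i + batch_size]
            for i in range(0, len(databases), batch_size)]


# Hypothetical database names; a real list would come from the Hive metastore.
databases = ["db_%03d" % n for n in range(1, 101)]
batches = split_into_batches(databases, batch_size=10)

for number, batch in enumerate(batches, start=1):
    # Each batch would become the database list of one Hive BDR schedule.
    print("batch %d: %s to %s (%d databases)"
          % (number, batch[0], batch[-1], len(batch)))
```

With 100 databases and a batch size of 10 this gives 10 batches, one per replication schedule.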

Tuesday, February 4, 2020

Python Script to Monitor and send email Cloudera Services and Roles health status

I know Cloudera has a built-in feature to send the cluster status by email, but in our environment mail from Cloudera Manager does not work for some reason. I tried all the configuration options with no luck.

We could not reach the mail server support team to resolve the issue. Fortunately, the mail command works from the Unix server, so I decided to write my own monitoring script using the Cloudera cm_client Python module and send the alerts to the support group.

Try my script below.
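A rough sketch of the alerting part only, assuming the health summaries have already been collected (for example through cm_client) as (name, health) pairs. The helper names, the sample data and the recipient address are made up for illustration. GOOD, BAD, CONCERNING and DISABLED are health summary values that Cloudera Manager reports; treating DISABLED as healthy is my assumption.

```python
import subprocess

# Health summaries that should not trigger an alert (assumption: DISABLED is fine).
HEALTHY = {"GOOD", "DISABLED"}


def build_alert(statuses):
    """Return an alert body listing unhealthy entries, or None if all are healthy.

    statuses is a list of (name, health_summary) tuples.
    """
    unhealthy = [(name, health) for name, health in statuses
                 if health not in HEALTHY]
    if not unhealthy:
        return None
    lines = ["%s: %s" % (name, health) for name, health in sorted(unhealthy)]
    return "\n".join(lines)


def send_mail(subject, body, recipient):
    """Send the alert with the Unix mail command, passing the body on stdin."""
    subprocess.run(["mail", "-s", subject, recipient],
                   input=body.encode("utf-8"), check=True)


# Hypothetical sample data standing in for what the Cloudera Manager API returns.
sample = [("hdfs", "GOOD"), ("hive", "BAD"), ("yarn", "CONCERNING")]
alert = build_alert(sample)
print(alert)
# To actually send it (hypothetical recipient):
# send_mail("Cloudera health alert", alert, "support@example.com")
```

Keeping the message building separate from the mail call makes it easy to run the check from cron and only send mail when build_alert returns something.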

Monday, February 3, 2020

Sorting Yarn running jobs using python

Although the Hadoop YARN ResourceManager provides a nice UI for sorting YARN jobs, it is sometimes difficult to filter out idle sessions with custom sorting.

My script below sorts the idle sessions that have been running for more than an hour (3600000 ms) and prints each one with its application ID.

We can kill a session manually using the yarn command, or automate that in the same script.

App Name: dev-claim_report1
Application id: application_1580724162250_1127
Total epsed time: 3.0 hours
queue Name: root.devqueue-1
Allocated memory: 1651 gb
('Tracking Url: ', u'http://devmn-02.tanu.com:8088/proxy/application_1580724162250_1127/')
No long running jobs!

App Name: Spark shell
Application id: application_1580724162250_1151
Total epsed time: 1.0 hours
queue Name: root.devqueue-2
Allocated memory: 55 gb
('Tracking Url: ', u'http://devmn-02.tanu.com:8088/proxy/application_1580724162250_1151/')

App Name: Spark shell
Application id: application_1580724162250_1152
Total epsed time: 1.0 hours
queue Name: root.devqueue-3
Allocated memory: 55 gb
('Tracking Url: ', u'http://devmn-02.tanu.com:8088/proxy/application_1580724162250_1152/')

App Name: dev-claim_report2
Application id: application_1580724162250_1141
Total epsed time: 1.0 hours
queue Name: root.devqueue-4
Allocated memory: 1 gb
('Tracking Url: ', u'http://devmn-02.tanu.com:8088/proxy/application_1580724162250_1141/')
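A minimal sketch of the filter and sort step, assuming the application list was fetched from the ResourceManager REST endpoint /ws/v1/cluster/apps?states=RUNNING and parsed from JSON. The field names (id, name, queue, elapsedTime, allocatedMB) are the ones that endpoint returns; the helper names and the sample applications are made up for illustration, and this is not the original script.

```python
ONE_HOUR_MS = 3600000


def long_running_apps(response, min_elapsed_ms=ONE_HOUR_MS):
    """Return running apps older than the threshold, longest-running first."""
    # Guard against an empty result, where "apps" may be null instead of a dict.
    apps = (response.get("apps") or {}).get("app", [])
    selected = [a for a in apps if a["elapsedTime"] > min_elapsed_ms]
    return sorted(selected, key=lambda a: a["elapsedTime"], reverse=True)


def describe(app):
    """Format one application as a short multi-line report."""
    return "\n".join([
        "App Name: %s" % app["name"],
        "Application id: %s" % app["id"],
        "Total elapsed time: %.1f hours" % (app["elapsedTime"] / 3600000.0),
        "Queue name: %s" % app["queue"],
        "Allocated memory: %d gb" % (app["allocatedMB"] // 1024),
    ])


# Hypothetical sample response; real data comes from the ResourceManager.
sample = {"apps": {"app": [
    {"id": "application_1_0001", "name": "report", "queue": "root.q1",
     "elapsedTime": 7200000, "allocatedMB": 4096},
    {"id": "application_1_0002", "name": "short job", "queue": "root.q2",
     "elapsedTime": 60000, "allocatedMB": 1024},
    {"id": "application_1_0003", "name": "Spark shell", "queue": "root.q1",
     "elapsedTime": 10800000, "allocatedMB": 2048},
]}}

result = long_running_apps(sample)
if not result:
    print("No long running jobs!")
for app in result:
    print(describe(app))
    print()
```

Each selected application ID could then be passed to yarn application -kill if we want to automate the cleanup.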

Kudu tablet servers Metric check using python

Recently I got a chance to work on a Kudu issue. Developers started getting the error below. After going through a lot of documentation, I learned that the cluster had not been planned according to the Kudu recommendations.

dropped due to backpressure. The service queue is full; it has 50 items

Below are the Kudu recommendations.

Scale

Recommended maximum number of tablet servers is 100.
Recommended maximum number of masters is 3.
Recommended maximum amount of stored data, post-replication and post-compression, per tablet server is 8TB.
Recommended maximum number of tablets per tablet server is 2000, post-replication.
Maximum number of tablets per table for each tablet server is 60, post-replication, at table-creation time.

I needed to report the number of tablets per server to the developers, so that they could clean up tables or reduce partitions to meet the recommendations.

Below is the output of my Python script, which connects to the tablet server metrics and prints the details.

+------------------------------------------+-----------------+---------------+
| Server | Running tablets | Total_tablets |
+------------------------------------------+-----------------+---------------+
| devkn-01.tanu.com:8050 | 1790 | 11235 |
+------------------------------------------+-----------------+---------------+
| devkn-02.tanu.com:8050 | 1787 | 10970 |
+------------------------------------------+-----------------+---------------+
| devkn-03.tanu.com:8050 | 1924 | 11349 |
+------------------------------------------+-----------------+---------------+
| devkn-04.tanu.com:8050 | 1923 | 11325 |
+------------------------------------------+-----------------+---------------+
| devkn-05.tanu.com:8050 | 1838 | 11297 |
+------------------------------------------+-----------------+---------------+
| devkn-06.tanu.com:8050 | 1924 | 11299 |
+------------------------------------------+-----------------+---------------+
| devkn-07.tanu.com:8050 | 1788 | 11050 |
+------------------------------------------+-----------------+---------------+
| devkn-08.tanu.com:8050 | 1790 | 10564 |
+------------------------------------------+-----------------+---------------+
| devkn-09.tanu.com:8050 | 1921 | 10758 |
+------------------------------------------+-----------------+---------------+
| devkn-10.tanu.com:8050 | 1923 | 10899 |
+------------------------------------------+-----------------+---------------+
| devkn-11.tanu.com:8050 | 1868 | 9254 |
+------------------------------------------+-----------------+---------------+
| devkn-12.tanu.com:8050 | 2269 | 8101 |
+------------------------------------------+-----------------+---------------+
| devkn-13.tanu.com:8050 | 1802 | 10467 |
+------------------------------------------+-----------------+---------------+
| devkn-14.tanu.com:8050 | 1927 | 10875 |
+------------------------------------------+-----------------+---------------+
| devkn-15.tanu.com:8050 | 1601 | 10867 |
+------------------------------------------+-----------------+---------------+
| devkn-16.tanu.com:8050 | 2017 | 10088 |
+------------------------------------------+-----------------+---------------+
| devkn-17.tanu.com:8050 | 1793 | 10391 |
+------------------------------------------+-----------------+---------------+
| devkn-18.tanu.com:8050 | 1683 | 11631 |
+------------------------------------------+-----------------+---------------+
| devkn-19.tanu.com:8050 | 1946 | 9793 |
+------------------------------------------+-----------------+---------------+
| devkn-20.tanu.com:8050 | 1719 | 10488 |
+------------------------------------------+-----------------+---------------+
| devkn-21.tanu.com:8050 | 1703 | 9213 |
+------------------------------------------+-----------------+---------------+
| devkn-22.tanu.com:8050 | 1740 | 9920 |
+------------------------------------------+-----------------+---------------+
| devkn-23.tanu.com:8050 | 1827 | 9953 |
+------------------------------------------+-----------------+---------------+
| devkn-24.tanu.com:8050 | 1929 | 10094 |
+------------------------------------------+-----------------+---------------+
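A rough sketch of the counting step, assuming the JSON from each tablet server's /metrics page (web UI port 8050 in the table above) has already been downloaded. That page returns a JSON array of entities, each with a "type" field, and this sketch simply counts the entities of type "tablet". It does not reproduce the two columns of the table above, and the helper names and sample data are made up for illustration.

```python
import json

MAX_TABLETS_PER_SERVER = 2000  # recommended limit quoted above


def count_tablets(metrics):
    """Count the tablet entities in a parsed tablet server /metrics document."""
    return sum(1 for entity in metrics if entity.get("type") == "tablet")


def over_limit(tablets_by_server, limit=MAX_TABLETS_PER_SERVER):
    """Return the servers whose tablet count exceeds the limit, sorted by name."""
    return sorted(server for server, count in tablets_by_server.items()
                  if count > limit)


# Hypothetical, heavily shortened /metrics document for one tablet server.
sample_text = json.dumps([
    {"type": "server", "id": "kudu.tabletserver", "metrics": []},
    {"type": "tablet", "id": "tablet-a", "metrics": []},
    {"type": "tablet", "id": "tablet-b", "metrics": []},
])

counts = {"ts-1.example.com:8050": count_tablets(json.loads(sample_text))}
print(counts)

# Hypothetical per-server counts checked against the recommended limit.
print(over_limit({"ts-1": 1790, "ts-2": 2269}))
```

The list from over_limit is what I would hand to the developers, since those are the servers that need tables cleaned up or partitions reduced.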

Friday, January 3, 2020

Cloudera cluster creation on google compute instance

I wanted to quickly launch a Cloudera cluster in Google Cloud (I had some free credit and wanted to use it effectively), set up like an on-prem cluster with single sign-on across all the Linux nodes.

Initially I thought of integrating all the Linux servers with Active Directory + SSSD client, but later I moved to MIT Kerberos + OpenLDAP client + SASL passthrough.

This script has 2 parts:

PART 1: Creates the required number of GCP instances, creates the Hadoop users/groups, and installs SASL, OpenLDAP, MIT Kerberos, and the Cloudera agent and manager.

PART 2: Adds the hosts to Cloudera Manager, creates the cluster, and adds the HDFS and ZooKeeper services (still working on adding more services).

Part 2 of this script can easily be reused with other cloud providers (AWS, Azure) as long as the Cloudera Manager URL is exposed to the internet.

Saturday, November 23, 2019

google cloud compute instance creation using python script

The first step is to create a dedicated service account for our Python script, named libcloud (since we are going to use the Apache Libcloud Python framework).

Then map the necessary roles to the service account so that it can create compute instances.

After mapping those roles, I was still not able to create instances from my Python script; it kept throwing the exception below:

response = responseCls(**kwargs)
File "/home/sathish/miniconda3/lib/python3.7/site-packages/libcloud/common/base.py", line 154, in __init__
self.object = self.parse_body()
File "/home/sathish/miniconda3/lib/python3.7/site-packages/libcloud/common/google.py", line 267, in parse_body
raise GoogleBaseError(message, self.status, code)
libcloud.common.google.GoogleBaseError: "The user does not have access to service account '123333333333-compute@developer.gserviceaccount.com'. User: 'libcloud@xxxxxxxxx.iam.gserviceaccount.com'. Ask a project owner to grant you the iam.serviceAccountUser role on the service account"

Then I granted the additional roles that the error message asks for (iam.serviceAccountUser on the service account).

Then try the Python script below to create an instance:

       
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

ComputeEngine = get_driver(Provider.GCE)
# Note that the key file argument can be either the JSON format or
# the P12 format.
driver = ComputeEngine('libcloud@xxxxx.iam.gserviceaccount.com',
                       '/home/sathish/gcp_pem.json',
                       project='ferrous-weaver-xxxxx')


def list_all_gcp_images(driver):
    """Print the available GCP images; pick one of the names for create_instance."""
    for image in driver.list_images():
        print(image)


def create_instance(driver):
    """Create a compute instance named "n2" and return the new node."""
    size = 'n1-standard-1'
    image = 'centos-7-v20191121'
    zone = 'us-central1-a'

    # Give the instance's default service account the read-only storage scope.
    sa_scopes = [{'email': 'default', 'scopes': ['storage-ro']}]
    return driver.create_node("n2", size, image, zone,
                              ex_service_accounts=sa_scopes)


create_instance(driver)
list_all_gcp_images(driver)

Thursday, October 17, 2019

cloudera manager TLS via python api

import ssl

from cm_api.api_client import ApiResource
from cm_api.endpoints.cms import ClouderaManager

CM_HOST = "cm.tanu.com"

# Connection without TLS:
# api = ApiResource(CM_HOST, version=13, username="admin", password="admin")

# Trust the CA that signed the Cloudera Manager certificate.
cxt = ssl.create_default_context(cafile="/app/ca/ca.pem")

api = ApiResource(CM_HOST, version=12, username="admin", password="admin",
                  use_tls=True, ssl_context=cxt)

clu = api.get_cluster('Cluster 1')
hdfs = clu.get_service('hdfs')

# HDFS SSL settings (kept for reference, not applied here):
# hdfs_ssl_enable = { 'hdfs_hadoop_ssl_enabled' : 'true','ssl_server_keystore_location' : '/var/tmp/cm.jks','ssl_server_keystore_password':'test123','ssl_server_keystore_keypassword':'test123' }
# hdfs.update_config(svc_config=hdfs_ssl_enable)

# TLS settings for Cloudera Manager itself.
cm_ssl_conf = {'WEB_TLS': 'true',
               'KEYSTORE_PATH': '/opt/cloudera-manager/ssl/jks/javakeystore.jks',
               'KEYSTORE_PASSWORD': 'iCpjC"7]',
               'TRUSTSTORE_PATH': '/opt/cloudera-manager/ssl/jks/ca_combined.jks',
               'TRUSTSTORE_PASSWORD': 'test123'}

# Print the current Cloudera Manager configuration.
cm = ClouderaManager(api)
for name, config in cm.get_config(view="full").items():
    print("%s --> %s" % (name, config))

# Uncomment to apply the TLS settings to Cloudera Manager:
# cm.update_config(cm_ssl_conf)
print(hdfs)
print(clu)

# Print every host with its configuration.
for h in api.get_all_hosts():
    print(h.hostname)
    print(h.get_config())

print(api)