Friday, November 17, 2017

Cloudera SAML Integration with Shibboleth / LDAP Authentication Handler


IDP Configuration


Download shibboleth-identity-provider-3.3.2 and extract it into the following directory:


[tanu@cloudera:[ET] /cs/app/shibboleth-idp]$ ls -lrt /cs/app/shibboleth-idp
total 292
drwxr-x--- 2 tanu tanu   4096 Nov 16 10:26 old-20171116-1025
drwxr-x--- 6 tanu tanu   4096 Nov 16 10:26 dist
drwxr-x--- 2 tanu tanu   4096 Nov 16 10:26 doc
drwxr-x--- 6 tanu tanu   4096 Nov 16 10:26 system
drwxr-x--- 7 tanu tanu   4096 Nov 16 10:26 webapp
drwxr-x--- 2 tanu tanu   4096 Nov 16 10:26 credentials
drwxr-x--- 5 tanu tanu   4096 Nov 16 10:26 edit-webapp
drwxr-x--- 4 tanu tanu   4096 Nov 16 10:26 flows
drwxr-x--- 2 tanu tanu   4096 Nov 16 10:26 messages
drwxr-x--- 4 tanu tanu   4096 Nov 16 10:26 views
drwxr-x--- 2 tanu tanu   4096 Nov 16 10:26 war
-rw-r----- 1 tanu tanu 235520 Nov 16 11:29 conf.org.tar
drwxr-x--- 3 tanu tanu   4096 Nov 16 14:12 bin
drwxr-x--- 2 tanu tanu   4096 Nov 17 12:09 metadata
drwxr-x--- 2 tanu tanu   4096 Nov 17 12:48 logs
drwxr-x--- 6 tanu tanu   4096 Nov 17 13:04 conf

############################################################################

Update access-control.xml so that clients in your networks can reach the IdP's access-controlled endpoints (for example, the status page):


p:allowedRanges="#{ {'127.0.0.1/32', '::1/128','179.30.0.0/25','179.0.0.0/25'} }" />
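
A /25 range covers only 128 addresses, so it is easy to leave a client out by accident. The sketch below is one way to check whether a given client IP falls inside the ranges listed above. It uses only the Python standard library; the function name `is_allowed` and the sample IPs are my own, not anything the IdP provides.

```python
import ipaddress

# Ranges copied from the allowedRanges value above.
ALLOWED_RANGES = ["127.0.0.1/32", "::1/128", "179.30.0.0/25", "179.0.0.0/25"]


def is_allowed(client_ip, ranges=ALLOWED_RANGES):
    """Return True if client_ip falls inside any of the CIDR ranges."""
    addr = ipaddress.ip_address(client_ip)
    for cidr in ranges:
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return True
    return False


# Hypothetical client addresses, for illustration only.
for ip in ("179.30.0.10", "179.30.0.200", "127.0.0.1"):
    print(ip, is_allowed(ip))
```

If a client that should have access comes back as not allowed, the range in access-control.xml probably needs to be widened.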


Edit the /cs/app/shibboleth-idp/conf/ldap.properties file and update the lines shown below.

Make sure there are no spaces at the end of any line; otherwise the services won't start properly.

For example, this line must end right after the value:
idp.authn.LDAP.useStartTLS                     = false
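
Trailing spaces are hard to see in an editor, so a small check can help. This is a minimal sketch, assuming the file is plain UTF-8 text; the function names are hypothetical helpers of mine.

```python
def find_trailing_whitespace(text):
    """Return (line_number, line) pairs for lines ending in spaces or tabs."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip(" \t"):
            problems.append((number, line))
    return problems


def check_file(path):
    """Print every line of the file at `path` that has trailing whitespace."""
    with open(path, encoding="utf-8") as handle:
        for number, line in find_trailing_whitespace(handle.read()):
            print(f"{path}:{number}: trailing whitespace: {line!r}")
```

Calling `check_file("/cs/app/shibboleth-idp/conf/ldap.properties")` before restarting the IdP should print nothing if the file is clean.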

########################################################

idp.authn.LDAP.authenticator                   = adAuthenticator

idp.authn.LDAP.ldapURL                          = ldaps://ad.tanu.com
idp.authn.LDAP.useStartTLS                     = false
idp.authn.LDAP.useSSL                          = true

idp.authn.LDAP.sslConfig                       = certificateTrust
idp.authn.LDAP.trustCertificates                = /var/tmp/ca_cert.pem

idp.authn.LDAP.returnAttributes                 = sAMAccountName,mail,company


idp.authn.LDAP.baseDN                           = DC=users,DC=tanu,DC=net
idp.authn.LDAP.subtreeSearch                   = true
idp.authn.LDAP.userFilter                       = (sAMAccountName={0})
idp.authn.LDAP.bindDN                           = adbinduser@TANU
idp.authn.LDAP.bindDNCredential                 = test123

idp.authn.LDAP.dnFormat                         = %s@TANU

idp.attribute.resolver.LDAP.ldapURL             = %{idp.authn.LDAP.ldapURL}
idp.attribute.resolver.LDAP.connectTimeout      = %{idp.authn.LDAP.connectTimeout:PT3S}
idp.attribute.resolver.LDAP.responseTimeout     = %{idp.authn.LDAP.responseTimeout:PT3S}
idp.attribute.resolver.LDAP.baseDN              = %{idp.authn.LDAP.baseDN:undefined}
idp.attribute.resolver.LDAP.bindDN              = %{idp.authn.LDAP.bindDN:undefined}
idp.attribute.resolver.LDAP.bindDNCredential    = %{idp.authn.LDAP.bindDNCredential:undefined}
idp.attribute.resolver.LDAP.useStartTLS         = %{idp.authn.LDAP.useStartTLS:true}
idp.attribute.resolver.LDAP.trustCertificates   = %{idp.authn.LDAP.trustCertificates:undefined}
idp.attribute.resolver.LDAP.searchFilter        = (sAMAccountName=$resolutionContext.principal)
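
The idp.attribute.resolver.LDAP.* lines reuse the idp.authn.LDAP.* values through %{key:default} placeholders, so useStartTLS for the resolver ends up false here even though the placeholder default is true. The sketch below only imitates that behaviour to show which values the resolver ends up with; it is not the IdP's own resolution code, and `parse_properties` and `resolve` are names I made up.

```python
import re

# Matches %{key} and %{key:default}.
PLACEHOLDER = re.compile(r"%\{([^}:]+)(?::([^}]*))?\}")


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def resolve(value, props, depth=0):
    """Expand placeholders, using the default when the key is not set."""
    if depth > 10:
        raise ValueError("placeholder nesting too deep")

    def substitute(match):
        key, default = match.group(1), match.group(2)
        if key in props:
            return resolve(props[key], props, depth + 1)
        if default is not None:
            return default
        raise KeyError(key)

    return PLACEHOLDER.sub(substitute, value)


sample = """
idp.authn.LDAP.ldapURL = ldaps://ad.tanu.com
idp.authn.LDAP.useStartTLS = false
idp.attribute.resolver.LDAP.ldapURL = %{idp.authn.LDAP.ldapURL}
idp.attribute.resolver.LDAP.useStartTLS = %{idp.authn.LDAP.useStartTLS:true}
idp.attribute.resolver.LDAP.connectTimeout = %{idp.authn.LDAP.connectTimeout:PT3S}
"""
props = parse_properties(sample)
for key in sorted(props):
    if key.startswith("idp.attribute.resolver."):
        print(key, "=", resolve(props[key], props))
```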

##############################################################

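Back up attribute-resolver.xml as attribute-resolver.xml_bkf, then replace it with the LDAP template and define the attributes:
mv attribute-resolver.xml attribute-resolver.xml_bkf
cp attribute-resolver-ldap.xml attribute-resolver.xml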


    <AttributeDefinition id="sAMAccountName" xsi:type="Simple" sourceAttributeID="sAMAccountName">
        <Dependency ref="myLDAP" />
        <AttributeEncoder xsi:type="SAML1String" name="urn:mace:dir:attribute-def:uid" encodeType="false" />
        <AttributeEncoder xsi:type="SAML2String" name="urn:oid:0.9.2342.19200300.100.1.1" friendlyName="sAMAccountName" encodeType="false" />
    </AttributeDefinition>

    <!--
    In the rest of the world, the email address is the standard identifier,
    despite the problems with that practice. Consider making the EPPN value
    the same as your official email addresses whenever possible.
    -->
    <AttributeDefinition id="mail" xsi:type="Simple" sourceAttributeID="mail">
        <Dependency ref="myLDAP" />
        <AttributeEncoder xsi:type="SAML1String" name="urn:mace:dir:attribute-def:mail" encodeType="false" />
        <AttributeEncoder xsi:type="SAML2String" name="urn:oid:0.9.2342.19200300.100.1.3" friendlyName="mail" encodeType="false" />
    </AttributeDefinition>

    <AttributeDefinition id="ou" xsi:type="Simple" sourceAttributeID="role">
        <Dependency ref="myLDAP" />
        <AttributeEncoder xsi:type="SAML1String" name="urn:mace:dir:attribute-def:ou" encodeType="false" />
        <AttributeEncoder xsi:type="SAML2String" name="urn:oid:2.5.4.11" friendlyName="role" encodeType="false" />
    </AttributeDefinition>

    <DataConnector id="myLDAP" xsi:type="LDAPDirectory"
        ldapURL="%{idp.attribute.resolver.LDAP.ldapURL}"
        baseDN="%{idp.attribute.resolver.LDAP.baseDN}"
        principal="%{idp.attribute.resolver.LDAP.bindDN}"
        principalCredential="%{idp.attribute.resolver.LDAP.bindDNCredential}"
        useStartTLS="%{idp.attribute.resolver.LDAP.useStartTLS:true}"
        connectTimeout="%{idp.attribute.resolver.LDAP.connectTimeout}"
        trustFile="%{idp.attribute.resolver.LDAP.trustCertificates}"
        responseTimeout="%{idp.attribute.resolver.LDAP.responseTimeout}">
        <FilterTemplate>
            <![CDATA[
                %{idp.attribute.resolver.LDAP.searchFilter}
            ]]>
        </FilterTemplate>
        <ConnectionPool
            minPoolSize="%{idp.pool.LDAP.minSize:3}"
            maxPoolSize="%{idp.pool.LDAP.maxSize:10}"
            blockWaitTime="%{idp.pool.LDAP.blockWaitTime:PT3S}"
            validatePeriodically="%{idp.pool.LDAP.validatePeriodically:true}"
            validateTimerPeriod="%{idp.pool.LDAP.validatePeriod:PT5M}"
            expirationTime="%{idp.pool.LDAP.idleTime:PT10M}"
            failFastInitialize="%{idp.pool.LDAP.failFastInitialize:false}" />
    </DataConnector>


############################################################################

Edit the attribute-filter.xml

    <AttributeFilterPolicy id="anyone">
        <PolicyRequirementRule xsi:type="ANY" />
        <AttributeRule attributeID="sAMAccountName">
            <PermitValueRule xsi:type="ANY" />
        </AttributeRule>
        <AttributeRule attributeID="mail">
            <PermitValueRule xsi:type="ANY" />
        </AttributeRule>
        <AttributeRule attributeID="ou">
            <PermitValueRule xsi:type="ANY" />
        </AttributeRule>
    </AttributeFilterPolicy>
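
As far as I understand, an AttributeRule only takes effect when its attributeID matches the id of an AttributeDefinition in attribute-resolver.xml. The sketch below cross-checks the two files and lists any rule that has no matching definition. It matches elements by local name and ignores XML namespaces; the function names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def collect_ids(xml_text, element, attribute):
    """Collect `attribute` values from every element named `element`."""
    root = ET.fromstring(xml_text)
    return {
        node.get(attribute)
        for node in root.iter()
        if local_name(node.tag) == element and node.get(attribute)
    }


def unmatched_filter_rules(resolver_xml, filter_xml):
    """Return attributeIDs in the filter that the resolver does not define."""
    defined = collect_ids(resolver_xml, "AttributeDefinition", "id")
    released = collect_ids(filter_xml, "AttributeRule", "attributeID")
    return sorted(released - defined)


# Hypothetical, trimmed-down samples for illustration.
resolver_sample = (
    '<AttributeResolver xmlns="urn:mace:shibboleth:2.0:resolver">'
    '<AttributeDefinition id="mail"/><AttributeDefinition id="ou"/>'
    "</AttributeResolver>"
)
filter_sample = (
    '<AttributeFilterPolicyGroup xmlns="urn:mace:shibboleth:2.0:afp">'
    '<AttributeFilterPolicy id="anyone">'
    '<AttributeRule attributeID="mail"/><AttributeRule attributeID="role"/>'
    "</AttributeFilterPolicy></AttributeFilterPolicyGroup>"
)
print(unmatched_filter_rules(resolver_sample, filter_sample))
```

For the real files, read conf/attribute-resolver.xml and conf/attribute-filter.xml into strings and pass them to `unmatched_filter_rules`.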

######################################################################

Update metadata-providers.xml with the Cloudera Manager SP metadata URL:

    <MetadataProvider id="clouderaManager"
                      xsi:type="FileBackedHTTPMetadataProvider"
                      backingFile="%{idp.home}/metadata/clouderalocalCopyFromXYZHTTP.xml"
                      metadataURL="http://cm.tanu.com:7180/saml/metadata">
       <!-- <MetadataFilter xsi:type="RequiredValidUntil" maxValidityInterval="P14D"/> -->
       <MetadataFilter xsi:type="EntityRoleWhiteList">
          <RetainedRole>md:SPSSODescriptor</RetainedRole>
       </MetadataFilter>
    </MetadataProvider>
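
Because the EntityRoleWhiteList filter keeps only md:SPSSODescriptor roles, it is worth confirming that the metadata served by Cloudera Manager really contains one. Here is a rough sketch using the standard SAML 2.0 metadata namespace; `sp_entity_ids` and `fetch_metadata` are hypothetical helper names, and the sample document is made up.

```python
import urllib.request
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"


def sp_entity_ids(metadata_xml):
    """Return entityID values of entities that carry an SPSSODescriptor."""
    root = ET.fromstring(metadata_xml)
    entity_tag = f"{{{MD_NS}}}EntityDescriptor"
    sp_tag = f"{{{MD_NS}}}SPSSODescriptor"
    ids = []
    for entity in root.iter(entity_tag):
        if entity.find(sp_tag) is not None:
            ids.append(entity.get("entityID"))
    return ids


def fetch_metadata(url):
    """Download the metadata document from `url` and return it as bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


# Made-up sample document, for illustration only.
sample = (
    '<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata" '
    'entityID="hypothetical-cm">'
    '<md:SPSSODescriptor '
    'protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"/>'
    "</md:EntityDescriptor>"
)
print(sp_entity_ids(sample))
```

Against the real server this would be `sp_entity_ids(fetch_metadata("http://cm.tanu.com:7180/saml/metadata"))`; an empty list would mean the filter drops everything.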


Cloudera SP Configuration


1) Create the SP keystore file
   
keytool -genkeypair -keystore cm-sp.keystore -keyalg RSA -alias node1 -dname "CN=cm.tanu.com,O=Hadoop" -storepass changeme -keypass changeme -validity 365

2) Copy the /cs/app/shibboleth-idp/metadata/idp-metadata.xml file to a custom path on the Cloudera Manager host


3) Go to Cloudera Manager --> Administration --> Settings --> External Authentication and select SAML






Then restart Cloudera Manager. The login page should now redirect to the IdP server for authentication.

Monday, November 6, 2017

spark-shell proxy settings for Scala programming

Today we tried to parse an XML URL through spark-shell, but we ended up with the error below:

scala> val xml = XML.load("http://static.klipfolio.com/static/klips/saas/example_data/sales.xml")
java.net.UnknownHostException: static.klipfolio.com
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:184)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)

Although the Unix server was configured with the environment variables below, that didn't work for me.

export http_proxy="http://proxy.tanu.com:8080"
export https_proxy="http://proxy.tanu.com:8080"

The JVM does not read these environment variables, so I then passed the proxy configuration via spark.driver.extraJavaOptions:

spark-shell --conf "spark.driver.extraJavaOptions=-Dhttp.proxyHost=proxy.tanu.com -Dhttp.proxyPort=8080 -Dhttps.proxyHost=proxy.tanu.com -Dhttps.proxyPort=8080"
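
The Java proxy properties take host and port separately, which is easy to get wrong when copying from an http_proxy style URL. A small sketch that derives the option string from such a URL is shown here; the helper name `proxy_java_options` is my own, not a Spark API.

```python
from urllib.parse import urlsplit


def proxy_java_options(proxy_url):
    """Build the -D proxy options for the JVM from a proxy URL."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or not parts.port:
        raise ValueError(f"proxy URL needs a host and a port: {proxy_url!r}")
    host, port = parts.hostname, parts.port
    return " ".join([
        f"-Dhttp.proxyHost={host}",
        f"-Dhttp.proxyPort={port}",
        f"-Dhttps.proxyHost={host}",
        f"-Dhttps.proxyPort={port}",
    ])


options = proxy_java_options("http://proxy.tanu.com:8080")
print(f'spark-shell --conf "spark.driver.extraJavaOptions={options}"')
```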


scala> import scala.xml.XML
import scala.xml.XML

scala> val xml = XML.load("http://static.klipfolio.com/static/klips/saas/example_data/sales.xml")
xml: scala.xml.Elem =
<root>
        <qtd>
                <area>
                        <name>Sweden</name>
                        <bookings>1080180</bookings>
                        <bookings_q1>323458</bookings_q1>
                        <bookings_q2>245684</bookings_q2>
                        <bookings_q3>260098</bookings_q3>
                        <bookings_q4>250840</bookings_q4>
                        <weighted>1055232</weighted>
                        <trend>/images/resources/indicators/small/ind-circle-green.png</trend>
                        <on_target>/images/resources/indicators/small/ind-check-green.png</on_target>
                </area>
                <area>
                        <name>Norway</name>
                        <bookings>850685</bookings>
                        <bookings_q1>196845</bookings_q1>
                        <bookings_q2>185625</bookings_q2>
                        <bookings_q3>226300</bookings_q3>
                        <bookings_q4>241915</bookings_q4>
                        <weighted>1269685</weighted>
                        <trend>/images/resources/indicators/small/ind-diamond-yellow.png</trend>
                        <on_target>/images/resources/...
scala>

Hue notebook: "Session 'xxx' not found." (error 404)

After we set up the Livy server with Hue notebook, developers were able to run and save PySpark code through the notebook. However, they could not rerun the code after logging in again with a new session.


While troubleshooting, we found that Hue deletes the sessions once users log off:

127.0.0.1 - - [06/Nov/2017:16:14:25 +0000] "GET /sessions/3 HTTP/1.1" 200 -
127.0.0.1 - - [06/Nov/2017:16:14:26 +0000] "GET /sessions/3 HTTP/1.1" 200 -
127.0.0.1 - - [06/Nov/2017:16:14:31 +0000] "POST /sessions/3/statements HTTP/1.1" 201 -
127.0.0.1 - - [06/Nov/2017:16:14:31 +0000] "GET /sessions/3/statements/0 HTTP/1.1" 200 -
127.0.0.1 - - [06/Nov/2017:16:14:31 +0000] "GET /sessions/3/log?from=0 HTTP/1.1" 200 -
127.0.0.1 - - [06/Nov/2017:16:25:46 +0000] "DELETE /sessions/3 HTTP/1.1" 200 -
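
In a longer log the DELETE lines are easy to miss. The sketch below pulls out the session ids that were deleted successfully, assuming the access log keeps the format shown above; `deleted_sessions` is a hypothetical helper name.

```python
import re

# Matches the request and status part of one access log line.
REQUEST = re.compile(
    r'"(?P<method>[A-Z]+) /sessions/(?P<session>\d+)(?P<rest>\S*) '
    r'HTTP/[\d.]+" (?P<status>\d{3})'
)


def deleted_sessions(log_text):
    """Return ids of sessions removed by a successful DELETE /sessions/<id>."""
    deleted = []
    for line in log_text.splitlines():
        match = REQUEST.search(line)
        if not match:
            continue
        is_session_delete = (
            match.group("method") == "DELETE" and match.group("rest") == ""
        )
        if is_session_delete and match.group("status").startswith("2"):
            deleted.append(int(match.group("session")))
    return deleted


# Two lines taken from the log above.
log_sample = '''127.0.0.1 - - [06/Nov/2017:16:14:31 +0000] "GET /sessions/3/log?from=0 HTTP/1.1" 200 -
127.0.0.1 - - [06/Nov/2017:16:25:46 +0000] "DELETE /sessions/3 HTTP/1.1" 200 -'''
print(deleted_sessions(log_sample))
```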

So how do we reopen previously saved notebook sessions?

The solution is very simple:

Go to the notebook --> open the saved session --> then click the option highlighted below


Then click Recreate. You should now be able to re-run the code.