SPNEGO-Protected Hadoop Web Consoles Behind a Proxy (e.g., NameNode Web Console)

This post gives a quick overview of how Kerberos (SPNEGO) protected Hadoop web consoles (such as the NameNode, ResourceManager, JobHistoryServer, Hive, and Oozie web interfaces) work and how they should be configured when a proxy is placed in front of them.

Terms

Keytab: Think of a keytab (key table) as a file containing a list of records, each holding a Kerberos principal, a key version number, an encryption key derived from the principal's password, and a few other things. The password itself is never stored in plain text, but the derived key is all that is needed to authenticate as the principal, so anyone who can read the file can act as that principal and the file must be protected accordingly. It is not a big mistake to treat a keytab as a sort of credential store used for Kerberos authentication of non-human entities. A keytab entry never expires, but it becomes invalid as soon as the principal's password changes.

Authentication Service (AS/KDC): Provides the single sign-on service by issuing a TGT (Ticket Granting Ticket) based on the user's credentials. The TGT is a session ticket to the KDC only and is used each time the client contacts the KDC after the initial authentication. The TGT expires according to the configured Kerberos policy, but it can be renewed before expiration if needed.

Ticket Granting Service (TGS/KDC): The TGS issues service tickets, which let the client call Kerberos-protected services.

Kerberos Protected Web Consoles

Assume that the web console of interest is running on hostname.example.com and within the EXAMPLE.COM Kerberos realm. What happens when a browser goes to a Kerberized web console on a Windows client?

  1. The user logs on to the Windows domain from the workstation and obtains a TGT from the AS/KDC. As mentioned above, the TGT is a token for the KDC and is reused each time a client application running on the workstation (e.g., a browser) asks the TGS/KDC for a service ticket. The TGT is securely cached in memory (never saved in a file) for further KDC interactions.
  2. The browser sends a request for a Kerberos-protected Hadoop web console.
  3. The authentication handler of the web console intercepts the request processing pipeline on the service side and sends back an HTTP 401 Unauthorized response that includes the header line WWW-Authenticate: Negotiate.
  4. The browser looks up its local list of Kerberos-protected URIs (e.g., network.negotiate-auth.trusted-uris in Firefox). If the host name of the web console is present in the list of trusted URIs, the browser builds the service principal name (SPN) HTTP/<hostname.example.com>@EXAMPLE.COM and, using the TGT from step 1, asks the TGS/KDC for a service ticket to the web console running on hostname.example.com. (The format of the SPNEGO SPN is specified in RFC 4559.)
  5. The browser resends the request to the Kerberized web console with the service ticket (and a few other fields) wrapped in a token in the Authorization: Negotiate HTTP header.
  6. The authentication handler on the service side decrypts and verifies the service ticket. If the browser presents a valid ticket, access to the console is granted; otherwise, it is rejected. To verify the ticket, the service must know which service tickets it should accept. By default, Hadoop web console keytabs are configured to accept tickets for the <servicename>/<hostname.example.com>@EXAMPLE.COM and HTTP/<hostname.example.com>@EXAMPLE.COM SPNs. (E.g., NameNode accepts service tickets for hdfs/<hostname.example.com>@EXAMPLE.COM and HTTP/<hostname.example.com>@EXAMPLE.COM, but rejects all inbound requests containing tickets for any other SPN.)
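
Step 4 can be sketched in shell. The function below is a hypothetical helper (spn_for_url is not part of any Kerberos tool) that derives the SPN a SPNEGO client would request for a given URL. It assumes the realm is known up front and that the host name is used as typed; real clients may canonicalize the host name through DNS before building the SPN.

```shell
#!/usr/bin/env bash
# spn_for_url URL REALM
# Hypothetical helper: print the SPN a SPNEGO client would request for URL.
# Assumption: REALM is passed in explicitly instead of being looked up.
spn_for_url() {
    local url="$1" realm="$2" host
    host="${url#*://}"    # drop the scheme (http://, https://)
    host="${host%%/*}"    # drop the path
    host="${host%%:*}"    # drop the port
    # Host names are case-insensitive, so normalize to lower case.
    host="$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')"
    printf 'HTTP/%s@%s\n' "$host" "$realm"
}
```

For example, spn_for_url 'http://hostname.example.com:50070/dfshealth.html' EXAMPLE.COM prints HTTP/hostname.example.com@EXAMPLE.COM, which is the principal the service must be able to accept.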

Kerberized Web Consoles Behind Proxy

However, what happens when a load balancer (e.g., an HAProxy instance) is configured in front of a Kerberized web console and the browser sends the requests to proxy-hostname.example.com instead of hostname.example.com?

Not surprisingly, the user will be prompted for credentials. This happens because the SPN built by the browser is HTTP/<proxy-hostname.example.com>@EXAMPLE.COM, but no such SPN exists in the KDC. Moreover, even if the SPN existed in the KDC, the web console would reject tickets issued for this principal, as the service is not configured to accept them.

To fix the authentication issue, the following main steps need to be done.

  1. Create the HTTP/<proxy-hostname.example.com>@EXAMPLE.COM SPN within the EXAMPLE.COM realm.
  2. Get the keytab of HTTP/<proxy-hostname.example.com>@EXAMPLE.COM and merge it with the web console's keytab on the service side.
  3. Modify the service configuration to accept tickets for all SPNs listed in the web console's keytab.
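
Whether the merge in step 2 worked can be checked by listing the resulting keytab. The sketch below assumes MIT Kerberos klist output (klist -k or klist -kt, without -e); http_principals and has_spn are hypothetical helper names that only parse that listing.

```shell
#!/usr/bin/env bash
# http_principals: read `klist -k` / `klist -kt` output on stdin and print
# the distinct HTTP/... principals found in it.
http_principals() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^HTTP\//) print $i }' | sort -u
}

# has_spn SPN: succeed if SPN is among the HTTP principals read from stdin.
has_spn() {
    http_principals | grep -Fxq -- "$1"
}
```

With the merged keytab used later in this post, the check could look like klist -kt /etc/security/keytab/ha-hdfs.keytab | has_spn 'HTTP/<proxy-hostname.example.com>@EXAMPLE.COM' && echo present. Reading the keytab requires the same privileges as the service account.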

Configuring NameNode Web Console

For the NameNode web console, this means the following steps.

  1. Configure the proxy to forward proxy-hostname.example.com:50070 to namenode-hostname.example.com:50070.
  2. Create the HTTP/<proxy-hostname.example.com>@EXAMPLE.COM SPN in AD or the MIT KDC and obtain the SPN keytab file. This is easy with an MIT KDC, but a bit more difficult when AD provides the KDC. For both cases, have a look at how Cloudera Manager does it.
  3. On both NameNodes, execute the following commands as the root user.
    $ dd if=/dev/urandom of=/etc/security/http_secret bs=1024 count=1
    $ chown hdfs:hadoop /etc/security/http_secret
    $ chmod 440 /etc/security/http_secret
    $ mkdir -p /etc/security/keytab
    

    Copy the keytab file of HTTP/<proxy-hostname.example.com>@EXAMPLE.COM to /etc/security/keytab/http.keytab.

    $ chown root:root /etc/security/keytab/http.keytab
    $ chmod 600 /etc/security/keytab/http.keytab
  4. Locate the HDFS keytab file and copy it to /etc/security/keytab/hdfs.keytab, preserving the original ownership and permissions (cp -a). On a Kerberized cluster, this file should already contain the entries for hdfs/namenode-hostname.example.com@EXAMPLE.COM and HTTP/namenode-hostname.example.com@EXAMPLE.COM.
  5. Merge http.keytab and hdfs.keytab and set the appropriate permissions.
    $ ktutil
    ktutil: rkt /etc/security/keytab/http.keytab
    ktutil: rkt /etc/security/keytab/hdfs.keytab
    ktutil: wkt /etc/security/keytab/ha-hdfs.keytab
    ktutil: quit
    
    $ chown hdfs:hdfs /etc/security/keytab/ha-hdfs.keytab
    $ chmod 600 /etc/security/keytab/ha-hdfs.keytab
  6. After making a backup, add or modify the following properties in hdfs-site.xml on both the active and standby NameNodes:
    <property>
        <name>hadoop.http.authentication.kerberos.principal</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.http.authentication.kerberos.keytab</name>
        <value>/etc/security/keytab/ha-hdfs.keytab</value>
    </property>
    <property>
        <name>hadoop.http.authentication.simple.anonymous.allowed</name>
        <value>true</value>
    </property>
    <property>
        <name>hadoop.http.authentication.type</name>
        <value>kerberos</value>
    </property>
    <property>
        <name>hadoop.http.filter.initializers</name>
        <value>
            org.apache.hadoop.security.AuthenticationFilterInitializer
        </value>
    </property>
    <property>
        <name>hadoop.http.authentication.signature.secret.file</name>
        <value>/etc/security/http_secret</value>
    </property>
    <property>
        <name>hadoop.http.authentication.cookie.domain</name>
        <value>example.com</value>
    </property>
  7. Restart the NameNode daemon processes.
  8. Verify that it works.
    $ kinit <your-username>@EXAMPLE.COM
    Password for <your-username>@EXAMPLE.COM: ********
    $ curl --negotiate --user : --request GET \
            'http://<proxy-hostname.example.com>:50070/dfshealth.html'
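
The check in step 8 can be wrapped in a small script that looks at the HTTP status code instead of the page body. This is only a sketch: check_console and explain_status are hypothetical names, and the mapping from status codes to causes is a rule of thumb, not something taken from the Hadoop documentation.

```shell
#!/usr/bin/env bash
# explain_status CODE: map an HTTP status code to a likely cause.
explain_status() {
    case "$1" in
        200) echo "OK: SPNEGO authentication succeeded" ;;
        401) echo "FAILED: no valid ticket was sent (check kinit and the SPN)" ;;
        403) echo "FAILED: request refused (the service may have rejected the ticket)" ;;
        000) echo "FAILED: no HTTP response (check the proxy host and port)" ;;
        *)   echo "UNEXPECTED: HTTP $1" ;;
    esac
}

# check_console URL: request URL with SPNEGO and explain the result.
# Needs a valid TGT (run kinit first) and a curl built with GSS-API support.
check_console() {
    local code
    code="$(curl --negotiate --user : --silent --output /dev/null \
                 --write-out '%{http_code}' "$1")"
    explain_status "$code"
}
```

After kinit, running check_console 'http://<proxy-hostname.example.com>:50070/dfshealth.html' gives a one-line verdict that is easy to use in a monitoring job.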

The above procedure assumes that the SPN passwords never expire. Because a Kerberos keytab entry is functionally equivalent to a password, the keytab file becomes invalid as soon as the SPN password changes, and the Hadoop services go red.
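
A stale keytab can usually be spotted by comparing key version numbers (KVNOs). The sketch below assumes MIT Kerberos tools: klist -k lists the KVNO stored in the keytab, and kvno <principal> prints the KVNO currently known to the KDC in the form "<principal>: kvno = N". keytab_kvno and kdc_kvno are hypothetical helper names that only parse that output; KVNO handling in AD can differ, so treat the result as a hint.

```shell
#!/usr/bin/env bash
# keytab_kvno PRINCIPAL: read `klist -k` output on stdin and print the
# highest KVNO stored for PRINCIPAL (prints nothing if it is absent).
# Assumption: the principal is the last column (klist run without -e).
keytab_kvno() {
    awk -v p="$1" '
        $NF == p { if (!found || $1 + 0 > max) max = $1 + 0; found = 1 }
        END      { if (found) print max }'
}

# kdc_kvno: read `kvno PRINCIPAL` output on stdin and print the number.
kdc_kvno() {
    sed -n 's/^.*: kvno = \([0-9][0-9]*\)$/\1/p'
}
```

Feed klist -k /etc/security/keytab/ha-hdfs.keytab into keytab_kvno and the output of kvno (run with a valid TGT) into kdc_kvno. If the KDC reports a higher number than the keytab, the keytab most likely needs to be regenerated and merged again.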

Hope it helps.
