Zookeeper authentication

lukestephenon
Hello,

The glu agent's ZooKeeper configuration doesn't appear to support authenticated connections.  In prod, the ZooKeeper cluster makes use of authentication and ACLs.

Normally I would authenticate with ZooKeeper.addAuthInfo:
http://zookeeper.apache.org/doc/r3.2.2/api/org/apache/zookeeper/ZooKeeper.html#addAuthInfo(java.lang.String, byte[])
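
For readers unfamiliar with that call, here is a minimal sketch of what the built-in "digest" scheme involves, using only the JDK so it runs without the ZooKeeper client on the classpath. The class name, user name, and password are hypothetical, and the ACL id format (user:base64(sha1(user:password))) reflects ZooKeeper's default digest provider as far as I know; treat it as an illustration, not as anything glu does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not part of glu or ZooKeeper.
final class ZkDigestSketch {

    // Bytes a client would pass as the second argument of
    // ZooKeeper.addAuthInfo("digest", ...): the plain "user:password" string.
    static byte[] authBytes(String user, String password) {
        return (user + ":" + password).getBytes(StandardCharsets.UTF_8);
    }

    // Id stored in a digest ACL, assuming the default scheme:
    // user + ":" + base64(sha1("user:password")).
    static String aclId(String user, String password) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] hash = sha1.digest(authBytes(user, password));
            return user + ":" + Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 must be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // With the real client this would be something like:
        //   zk.addAuthInfo("digest", authBytes("glu", "secret"));
        System.out.println(aclId("glu", "secret"));
    }
}
```

The point is that the client only needs a user name and a password at connection time; the hashed form is what ends up in the ACL on the znode.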

Is this supported with glu?  If not, is the norm just to use a ZooKeeper cluster with no security restrictions?

I'd have guessed this would have already been asked, but I couldn't find it on the forums or in the prod setup guide.  Apologies if this duplicates a question that has already been answered.

Thanks

Luke

Re: Zookeeper authentication

frenchyan
Administrator
Luke, you are correct. glu does not support ZooKeeper authentication. If people are interested, I can certainly look into it.

Yan

Re: Zookeeper authentication

lukestephenon
Hi Yan, is the worst-case scenario that someone with bad intent could modify ZooKeeper data and thereby affect the actions the glu agent performs?  For example, changing the init-params and causing the agent to deploy a different process.  I'm trying to assess what security implications this has.

Re: Zookeeper authentication

frenchyan
Administrator
The agent uses ZooKeeper for writing the state of the scripts it manages; it never reads that state back (except at boot time, see below for this "exception"). This is actually enforced by the fact that ZooKeeperStorage implements the WriteOnlyStorage API: https://github.com/linkedin/glu/blob/master/agent/org.linkedin.glu.agent-impl/src/main/groovy/org/linkedin/glu/agent/impl/zookeeper/ZooKeeperStorage.groovy#L36

The agent uses its own local file system to store the state of the script and read it back (on agent restart). So the use case you are describing cannot happen.

The only time the agent reads from ZooKeeper is at boot time, when it reads a portion of its configuration from it. This is defined in agentConfig.properties (https://github.com/linkedin/glu/blob/master/agent/org.linkedin.glu.agent-impl/src/main/groovy/org/linkedin/glu/agent/impl/zookeeper/ZooKeeperStorage.groovy#L36). Note that this step is optional and you can put whatever URL you want there: by default it reads from ZooKeeper, but it does not have to be this way.

One thing that the agent reads from ZooKeeper during this phase is the public key of the console. So I can imagine somebody changing the key in ZooKeeper so that the agent trusts another client to execute commands. This would require restarting the agent, since the agent does not reload its configuration until restart, which in turn requires access to the host. And if you have access to the host to restart the agent, I am not sure what you would gain from being able to tell the agent to execute a command, since you are already on the host and can simply execute the command yourself. But again, this is configurable: if you are uncomfortable storing config properties in ZooKeeper for agent boot, simply store them either locally or on an HTTP server and change the URL in the agentConfig.properties file.

In my mind, the worst-case scenario would be somebody simply deleting everything in ZooKeeper (rm -r /org/glu). That would create some chaos in your deployment/release team for sure, as the console would suddenly be red for everything because the state has been wiped out. But your applications installed by glu would still be up and running. Like I said before, the agent only writes into ZooKeeper and does not use it for any other purpose (besides boot). To recover from this, you would have to issue an Agent.sync() on every agent so that they rewrite their state in ZooKeeper. You can issue this command with the agent-cli (--sync option): you would have to loop over every single one of your agents and issue the command to each of them. I guess there would probably be more harm done by the panic following everything turning red in the console than by what really happened :)
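
The "loop over every agent" recovery step could be scripted along these lines. This is only a sketch: the --sync option comes from the post above, but the class name, the agent URLs, the port, the script path, and the -s flag used to pass the agent URL are all assumptions that should be checked against your own agent-cli before use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of glu.
final class SyncAllAgentsSketch {

    // Builds one agent-cli command line per agent URL.
    // "-s <url>" is an assumed way of addressing an agent;
    // "--sync" is the option mentioned in the post above.
    static List<List<String>> syncCommands(String cliPath, List<String> agentUrls) {
        List<List<String>> commands = new ArrayList<>();
        for (String url : agentUrls) {
            commands.add(List.of(cliPath, "-s", url, "--sync"));
        }
        return commands;
    }

    public static void main(String[] args) {
        // Hypothetical agent URLs; replace with your own inventory.
        List<String> agents = List.of("https://agent-1:12906/", "https://agent-2:12906/");
        for (List<String> cmd : syncCommands("./bin/agent-cli.sh", agents)) {
            // Dry run: print the command instead of executing it.
            // To execute: new ProcessBuilder(cmd).inheritIO().start().waitFor();
            System.out.println(String.join(" ", cmd));
        }
    }
}
```

Printing the commands first makes it easy to review the list of agents before anything is actually sent.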

Yan

Re: Zookeeper authentication

lukestephenon
Thank you for the detailed explanation.

Re: Zookeeper authentication

lukestephenon
I've given this some further thought recently and the lack of zookeeper authentication is still concerning me.

1. The console (ZookeeperTreeTracker) can be brought down by writing large amounts of data into the monitored ZooKeeper tree (causing an OutOfMemoryError).
2. Anyone can start an agent that will join a cluster, and name it so that it appears to be part of the environment.  That server may then be sent production configuration (if the glu script has init params).
3. I recently wanted to provide all agents with some additional private configuration which a glu script requires to connect to another system.  I would have liked to put this configuration in one place (ZooKeeper), but because that is not secured, I need to modify pre_master_conf.sh for each agent to pass it as a JVM arg.
4. I can't use initParams to pass any private configuration, as this will be publicly visible in ZooKeeper.

Any suggestions on how to work around these points?

Thanks

Luke

Re: Zookeeper authentication

frenchyan
Administrator

Regarding your concerns about ZooKeeper security: yes, I do agree. I guess one of the issues is that if you are going to require a username/password to talk to ZooKeeper, then the agent needs to know about it. That means the ZooKeeper username and password are on every agent. But I agree that they could be encrypted/obfuscated the same way the keys are. This was very impractical before the "easy" setup, since everything was manual. I suppose if you could simply add a ZooKeeper username and password in the meta model and have the setup process take care of it, it would be pretty sweet. That means changing the utils-zookeeper project to support username/password, as well as the meta model and setup. Quite a bit of work. I submitted a ticket for it: https://github.com/pongasoft/glu/issues/265


In regards to your specific use case (3 and 4), I believe there is a solution:

Under the Admin tab, there is an Encryption Keys section:

1) create a key (you simply give it a name)

2) go to encrypt/decrypt text

Enter the plain text string (make sure that your encryption key is selected in the drop down) and click encrypt => in the output area you will see the result of the encryption. You can simply use this value in your model. Example: I have encrypted the string "Hello World".

In my model I have this:

  "initParameters": {
    "message": "Encrypted-AES/CBC/PKCS5Padding(2AH69mj7wNGtZ6ZvFu_j_T,0snIjqHsu1Ex7gMIqnjXa2,cmJM)"
  },


Then in my glu script I have this:

  def start = { args ->
    log.info("From start => $params / $args")
    
    def message = org.linkedin.groovy.util.encryption.EncryptionUtils.decrypt(params.message, args.encryptionKeys)
    
    log.info("decrypted => $message")
  }

The output is the following:

2014/06/08 07:31:36.547 INFO [/m1/i001] From start => [message:Encrypted-AES/CBC/PKCS5Padding(2AH69mj7wNGtZ6ZvFu_j_T,0snIjqHsu1Ex7gMIqnjXa2,cmJM), tags:[a:tag1, e:tag1, e:tag2], metadata:[product:product1, container:[name:m1], cluster:c1, version:1.0.0]] / [encryptionKeys:[*** MASKED ***]]

2014/06/08 07:31:36.547 INFO [/m1/i001] decrypted => Hello World

Several points: 

* All the encryption keys created are automatically passed in the args of every phase
* The output in the log file is always masked (log4j.xml contains rules for this, so don't remove those rules :))
* args are not stored in ZooKeeper, unlike initParams; in other words, the keys are never really visible anywhere
* The encrypt string tab in the UI does not store anything: it is just a utility to generate the encrypted version of a string that you can use anywhere
* At this time you need to use an internal class to decrypt (EncryptionUtils), as decryption is not exposed in the shell (the only method exposed is untarAndDecrypt, which is not what you want); it should be promoted to a public API
* I did not personally work on this part of the code which is why I did not remember about it :) 
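
As background on the cipher named in the encrypted string (AES/CBC/PKCS5Padding), here is a minimal round trip using only the JDK. It is a generic sketch: the class name, the 128-bit key size, and the "iv,ciphertext" text layout are assumptions made for illustration and are not glu's actual key handling or serialization format.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper, not part of glu.
final class AesCbcSketch {
    static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    // Generates a random 128-bit AES key (key size is an assumption).
    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts with a fresh random IV; returns "base64url(iv),base64url(ciphertext)".
    static String encrypt(String plain, SecretKey key) {
        try {
            byte[] iv = new byte[16]; // AES block size
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
            return enc.encodeToString(iv) + "," + enc.encodeToString(ct);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encrypt(): splits off the IV, then decrypts the ciphertext.
    static String decrypt(String encoded, SecretKey key) {
        try {
            String[] parts = encoded.split(",", 2);
            Base64.Decoder dec = Base64.getUrlDecoder();
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(dec.decode(parts[0])));
            return new String(cipher.doFinal(dec.decode(parts[1])), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String token = encrypt("Hello World", key);
        System.out.println(token);
        System.out.println(decrypt(token, key));
    }
}
```

Because the IV is random, encrypting the same string twice gives different tokens, yet both decrypt to the original text with the same key.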

Hope this helps
Yan

Re: Zookeeper authentication

lukestephenon
Thanks Yan.  I hadn't noticed the encryption keys functionality.  I'll consider using this.

Re: Zookeeper authentication

frenchyan
Administrator
I wanted to follow up on this and whether you still think that adding ZooKeeper username/password would help or not.

Thanks
Yan

Re: Zookeeper authentication

lukestephenon
I still think there is benefit in adding the zookeeper auth to the agents.  

Anyone else want to see this?

Thanks

Luke

Re: Zookeeper authentication

sodul
+1 for adding authentication as an option.

More security is a good thing, even if only to prevent someone from deleting the glu data just out of curiosity.

Btw, we use Netflix Exhibitor alongside our zk instances to help manage them ... we even use glu to deploy the zk instances used by glu (chicken and egg).

Re: Zookeeper authentication

EB
CONTENTS DELETED
The author has deleted this message.

Re: Zookeeper authentication

frenchyan
Administrator
In reply to this post by sodul
I am certainly fine working on this since there seems to be some interest. I am curious whether you have any opinion on how to implement it. The issue I can see is that, as of right now, the set of keys used by glu to decrypt passwords and so on is stored in ZooKeeper, so it is kind of a catch-22: I cannot access ZooKeeper to get the credentials for ZooKeeper. So how would the agent know about the credentials for ZooKeeper, given that every agent needs access to them and that they somehow need to be kept secure?

Yan

Re: Zookeeper authentication

EB
CONTENTS DELETED
The author has deleted this message.