formal definition of a delta

rvs
Hi!

I've just discovered Glu a week or so ago and so far I'm really impressed by
its design, implementation and most of all docs. Great work guys!

One thing that I couldn't find an answer to is -- what is the formal definition
of a delta and how exactly deltas are handled by the Planner. I tried to look
at the raw deltas by calling the REST API at .../model/delta but the JSON
returned is a bit tough to understand without knowing what it is all supposed
to mean. Before I dive into the source code I thought I'd ask on the list
for any pointers to docs, etc.

Thanks,
Roman.

Re: formal definition of a delta

frenchyan
Administrator
Hi Roman

Welcome to glu! :)

This is an interesting question, and indeed I don't think I wrote a "formal" definition of the delta. So let me try to give you an idea of what the delta computation does:

The delta is the "difference" between the static model (the one you load in the console) and the live model (the one computed from the states of all the agents belonging to the fabric you are currently looking at).


The way this object is computed is the following:

Each entry in the model is "flattened", meaning it is represented as a map where each key is the dotted notation of the path to a value in that entry and each value is "simple". In other words, the map has only one level (no nesting of other maps or arrays). Example:

agent: "agent-1",
initParameters.webapp.port: 1234,
mountPoint: "/mp"
…etc...

Note that those keys are what is used on the dashboard to extract columns from the whole map.

Now you essentially have 2 maps, and what you do is compare the values for each key. If the values match, then everything is "fine". If they mismatch, then there is a delta.
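
The flatten-and-compare idea can be sketched in a few lines of Python. This is only an illustration and not glu's actual implementation: the function names `flatten` and `compare` and the sample port values are made up, and the sketch only descends into nested maps (it does not handle arrays such as tags).

```python
def flatten(entry, prefix=""):
    """Flatten nested dicts into a one-level map keyed by dotted notation."""
    flat = {}
    for key, value in entry.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def compare(expected, current):
    """Keep the plain value on a match, an {"ev", "cv"} pair on a mismatch."""
    result = {}
    for key in sorted(set(expected) | set(current)):
        ev, cv = expected.get(key), current.get(key)
        result[key] = ev if ev == cv else {"ev": ev, "cv": cv}
    return result


# Hypothetical static (expected) and live (current) entries.
static_entry = {"agent": "agent-1", "mountPoint": "/mp",
                "initParameters": {"webapp": {"port": 1234}}}
live_entry = {"agent": "agent-1", "mountPoint": "/mp",
              "initParameters": {"webapp": {"port": 8080}}}

delta = compare(flatten(static_entry), flatten(live_entry))
print(delta["initParameters.webapp.port"])  # {'ev': 1234, 'cv': 8080}
```

Keys that match keep their plain value, which mirrors the shape of the json output shown in the rest of this post.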

In the json, what you get is a map called "delta" which contains the delta for each entry in the model. Here is a real live output:

    "agent-1:/m3/i001": {
      "agent": "agent-1",
      "entryState": "running",
      "errorValueKeys": ["initParameters.message"],
      "initParameters.message": {
        "cv": "Hello World",
        "ev": "Hello World2"
      },
      "key": "agent-1:/m3/i001",
      "metadata.cluster": "c1",
      "metadata.container.name": "m3",
      "metadata.currentState": {"cv": "running"},
      "metadata.modifiedTime": {"cv": 1340644920253},
      "metadata.product": "product1",
      "metadata.scriptState.script.version": {"cv": "@script.version@"},
      "metadata.scriptState.stateMachine.currentState": {"cv": "running"},
      "metadata.version": "1.0.0",
      "mountPoint": "/m3/i001",
      "script": "file:/Users/ypujante/github/org.linkedin/glu/scripts/org.linkedin.glu.script-hello-world/src/main/groovy/HelloWorldScript.groovy",
      "state": "ERROR",
      "status": "delta",
      "statusInfo": "initParameters.message:[Hello World2!=Hello World]",
      "tags": [
        "a:tag1",
        "e:tag3",
        "e:tag4"
      ],
      "tags.a:tag1": "agent-1:/m3/i001",
      "tags.e:tag3": "agent-1:/m3/i001",
      "tags.e:tag4": "agent-1:/m3/i001"
    },

This output shows you the delta for only 1 entry (the one where agent="agent-1" and mountPoint="/m3/i001"). You can find all the "dotted notation" keys I was talking about. When the static model and the live model match, you only see the value, as in:

      "metadata.product": "product1",

When there is a mismatch between the live model and the static model then you see:

      "initParameters.message": {
        "cv": "Hello World",
        "ev": "Hello World2"
      },

where "cv" means current value (live model) and "ev" means expected value (static model)

The output also contains a host of other keys, like "state", "status" and "statusInfo", which describe the state this entry is in (in this example, in error, with "statusInfo" giving a user-friendly explanation of why), and "errorValueKeys", which is an array of the keys that are in error.
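
As a rough illustration of how a client could read such an entry, here is a small Python sketch that pulls the expected/current pair for each key listed in errorValueKeys. The helper name `mismatches` is hypothetical, and the sample is a trimmed copy of the output above; it assumes every key listed in errorValueKeys maps to a "cv"/"ev" object.

```python
# Trimmed copy of the delta entry shown above.
entry = {
    "agent": "agent-1",
    "errorValueKeys": ["initParameters.message"],
    "initParameters.message": {"cv": "Hello World", "ev": "Hello World2"},
    "metadata.currentState": {"cv": "running"},
    "metadata.product": "product1",
    "state": "ERROR",
    "status": "delta",
}


def mismatches(entry):
    """Map each key listed in errorValueKeys to its (expected, current) pair."""
    pairs = {}
    for key in entry.get("errorValueKeys", []):
        value = entry.get(key)
        if isinstance(value, dict):
            pairs[key] = (value.get("ev"), value.get("cv"))
    return pairs


for key, (ev, cv) in mismatches(entry).items():
    print(f"{key}: expected {ev!r}, found {cv!r}")
# initParameters.message: expected 'Hello World2', found 'Hello World'
```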

Note that when you call the REST API you can provide "flatten=true", which removes the "cv"/"ev" values entirely and picks only the one that should be used (this is exactly what the dashboard displays).

I understand that this is more an explanation of the delta than really a formal definition (not super trivial to do).

Hope this helps though and feel free to ask more questions if you have some. The best way to really understand the delta is to look at the dashboard in the console and run the json query (with flatten=true) and I think it will click.

Yan

On Sun, Jun 24, 2012 at 11:37 AM, rvs [via glu] <[hidden email]> wrote:
Hi!

I've just discovered Glu a week or so ago and so far I'm really impressed by
its design, implementation and most of all docs. Great work guys!

One thing that I couldn't find an answer to is -- what is the formal definition
of a delta and how exactly deltas are handled by the Planner. I tried to look
at the raw deltas by calling the REST API at .../model/delta but the JSON
returned is a bit tough to understand without knowing what it is all supposed
to mean. Before I dive into the source code I thought I'd ask on the list
for any pointers to docs, etc.

Thanks,
Roman.



Re: formal definition of a delta

rvs
Hi Yan!

First of all -- thanks for a really useful introduction to the
delta business. I think I'm starting to grok it now. A couple
of questions still:

On Mon, Jun 25, 2012 at 10:39 AM, frenchyan [via glu] wrote:
> Now you essentially have 2 maps and what you do is comparing the value for
> each key. If the value match, then everything is "fine". If they mismatch
> then there is a delta.

Question -- are all the mismatched flattened keys always supposed to
be collected into an errorValueKeys array? IOW, is the content of errorValueKeys
all I'm supposed to care about in the delta in order to zero in on the actual
differences?

> where "cv" means current value (live model) and "ev" means expected value
> (static model)

In my case I'm also seeing:

        "metadata.scriptState.script.gcLog":{
           "cv":"file:/tmp/org.linkedin.glu.packaging-all-4.4.0/apps/hdfs/datanode1/logs/gc.log"
        },

This is puzzling on two counts:
  1. there's no ev
  2. those keys (e.g. metadata.scriptState.script.gcLog) are missing
from errorValueKeys

Can you please help me understand what brings these types of
entries to life?

> Hope this helps though and feel free to ask more questions if you have some.
> The best way to really understand the delta is to look at the dashboard in
> the console and run the json query (with flatten=true) and I think it will
> click.

I think this is definitely starting to click! Here's a meta-question
though: so far I'm really happy about the layering of the Glu
architecture in how it has the notion of an agent at the very bottom
and how everything else (scripts, actions, etc.) is built on top
of the previous layer. In this mental model it seems that what an
orchestration engine does is essentially compute a plan, which is
nothing but a sequence of action calls on various scripts installed in
various agents. IOW, in comes the delta, out comes the plan, right?

So here's the question -- what kind of hooks does Glu provide me with
so that I can customize this plan computation step? The default behavior (e.g., if a
delta indicates a stopped service then a start action needs to be called) makes
sense, but how can I extend it? My primary interest is in extending it to
configuration management type of things. E.g., I would like to be able to have
all the configuration values for a Hadoop cluster in my static model, and when
any of them change I would like a certain set of actions to be called.

Thanks,
Roman.

Re: formal definition of a delta

frenchyan
Administrator
I know which piece of the equation you are missing :)

When the delta is computed, only certain keys trigger an error (which in the UI will be rendered as a red row instead of a green row, and in the json will make it into the errorValueKeys array). Basically initParameters are part of the error computation, and metadata are not. This is why your key metadata.scriptState.script.gcLog is not marked as an error.

The formal definition is in the code (and should be added to the documentation for sure, adding a ticket for it): https://github.com/linkedin/glu/blob/master/orchestration/org.linkedin.glu.orchestration-engine/src/main/java/org/linkedin/glu/orchestration/engine/delta/impl/DeltaMgrImpl.java#L226

Essentially, the keys that count are "parent", "script" and "entryState", as well as all the keys that start with "initParameters."
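
The rule above can be written as a small predicate. This Python sketch is only a paraphrase of the description in this post, not a port of DeltaMgrImpl: the names `counts_as_error` and `error_value_keys` are hypothetical, and the gcLog path in the sample is a made-up placeholder.

```python
# Keys that take part in the error computation, per the description above.
ERROR_KEYS = {"parent", "script", "entryState"}
ERROR_PREFIX = "initParameters."


def counts_as_error(key):
    """True if a mismatch on this key should mark the entry as in error."""
    return key in ERROR_KEYS or key.startswith(ERROR_PREFIX)


def error_value_keys(delta_entry):
    """Collect mismatched keys (cv != ev) that take part in the error computation."""
    return sorted(
        key for key, value in delta_entry.items()
        if isinstance(value, dict)
        and value.get("cv") != value.get("ev")
        and counts_as_error(key)
    )


sample = {
    "initParameters.message": {"cv": "Hello World", "ev": "Hello World2"},
    "metadata.scriptState.script.gcLog": {"cv": "file:/tmp/gc.log"},  # placeholder path
    "metadata.product": "product1",
}
print(error_value_keys(sample))  # ['initParameters.message']
```

The metadata key differs between the two models (there is no "ev"), but because it is not one of the error keys it is left out of the result.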

The code is written in such a way that this is configurable, but it is not exposed yet as a first class citizen (for example you can also exclude some keys). This can also be added fairly easily.

In regards to "ev" (or "cv") missing, it is simply due to the fact that the json formatter ignores map entries where the value is null; in other words, it's like having "ev": null. In your example this is actually a key that is coming from the live model (this specific key represents the value of the gcLog field in your glu script!) and is not defined (and should not be!) in the static model, hence "ev" is null.
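
To make the null handling concrete, here is a tiny Python sketch that mimics a formatter which skips null values. Python's own json module does not do this by default, so the hypothetical helper `drop_nulls` stands in for that behavior; the path is a made-up placeholder.

```python
import json


def drop_nulls(pair):
    """Mimic a formatter that omits map entries whose value is null."""
    return {key: value for key, value in pair.items() if value is not None}


# A key that exists only in the live model: there is no expected value.
pair = {"cv": "file:/tmp/gc.log", "ev": None}
print(json.dumps(drop_nulls(pair)))  # {"cv": "file:/tmp/gc.log"}
```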

Your understanding of the internals of glu is correct, and you are summarizing what the diagram in the documentation is showing: http://linkedin.github.com/glu/docs/latest/html/orchestration-engine.html

At this point in time the customizations that are available are about changing the state machine entirely (see http://linkedin.github.com/glu/docs/latest/html/glu-script.html#defining-your-own-state-machine). If a delta/error is detected, irrespective of the state machine you define, it will always trigger the following transitions: <current-state> -> NONE -> <expected running state>. This runs all the transitions in the state machine from the current state until the entry is fully undeployed, then reinstalls everything and executes all the steps until the entry reaches the "expected running state", which is "running" by default (but can be changed). There is a ticket (https://github.com/linkedin/glu/issues/23) to add the notion of actionArgs so as not to go through an entire undeploy/redeploy cycle. It has not been implemented yet.
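
The <current-state> -> NONE -> <expected running state> rule can be illustrated with a short Python sketch. It assumes a state machine shaped like glu's default one (NONE, installed, stopped, running with the install/configure/start actions and their inverses); the table and the function names are written out here for illustration and are not taken from the glu code.

```python
from collections import deque

# Assumed state machine: state -> list of (next state, action).
TRANSITIONS = {
    "NONE": [("installed", "install")],
    "installed": [("stopped", "configure"), ("NONE", "uninstall")],
    "stopped": [("running", "start"), ("installed", "unconfigure")],
    "running": [("stopped", "stop")],
}


def actions_between(start, goal):
    """Breadth-first search for the shortest action sequence from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for next_state, action in TRANSITIONS.get(state, []):
            if next_state not in seen:
                seen.add(next_state)
                queue.append((next_state, actions + [action]))
    raise ValueError(f"no path from {start} to {goal}")


def redeploy_plan(current, expected="running"):
    """Full undeploy down to NONE, then redeploy up to the expected state."""
    return actions_between(current, "NONE") + actions_between("NONE", expected)


print(redeploy_plan("running"))
# ['stop', 'unconfigure', 'uninstall', 'install', 'configure', 'start']
```

Note that the plan is the same regardless of which key differs, which is why a changed initParameter currently costs a full undeploy/redeploy cycle.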

Best
Yan

Re: formal definition of a delta

rvs
Hi Yan!

I can't thank you enough for a very clear and detailed explanation!
Much appreciated -- this has unblocked me for my future experiments
with Glu.

Thanks,
Roman.