Building the Correlator #2: Window architecture

In the previous article, I described what the window correlator is good for: analyzing long series of data to identify security incidents and produce alerts. I also discussed the memory issue that comes with storing long series of events, which arrive at the application at a rate of tens of thousands per second. The proposed solution is to keep only the relevant parts of the series, here called segments, in memory. Now we'll look at the segments in more detail, so we know exactly what structures they are composed of.

Predicate and evaluate phases: Organizing incoming events

First, take a look at the following picture:

Window correlator architecture

When a relevant event arrives at the correlator application and is accepted by the correlation rule in the predicate (filter) phase, the evaluation phase begins. Based on the primary dimension defined in the correlation rule (such as user name or device ID), the event is inserted into the data series. The place of insertion is determined by the timestamp, which is part of the event and specifies the exact time the event originated. If a segment is available, the correlator checks its start time and end time. If the event time falls within the time range defined by the segment, the event is stored inside the segment or, more specifically, inside a cell of the segment. Each segment has multiple cells. The cell size is defined by a parameter called resolution, and it can be any length of time, from a single second to several days. All cells for a single correlation rule cover the same length of time, so from the start time and the number of cells in the segment we can calculate the segment's end time.
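The relationship between start time, resolution and cell count can be sketched in a few lines of C. This is only an illustration under assumptions: the `segment_t` layout, the field names and the use of seconds as the time unit are hypothetical and not taken from the correlator's source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical segment header; names and units are illustrative only. */
typedef struct {
    int64_t  start_time;  /* inclusive start, in seconds */
    int64_t  resolution;  /* length of one cell, in seconds */
    uint32_t cell_count;  /* number of cells in the segment */
} segment_t;

/* The (exclusive) end time follows from the start time and the cell count. */
static int64_t segment_end_time(const segment_t *seg)
{
    return seg->start_time + (int64_t)seg->cell_count * seg->resolution;
}

/* Returns the index of the cell that covers timestamp `ts`,
 * or -1 if `ts` lies outside the segment's time range. */
static int64_t segment_cell_index(const segment_t *seg, int64_t ts)
{
    assert(seg->resolution > 0);
    if (ts < seg->start_time || ts >= segment_end_time(seg))
        return -1;
    return (ts - seg->start_time) / seg->resolution;
}
```

With a start time of 1000, a resolution of 60 seconds and five cells, the segment ends at 1300, and an event stamped 1060 lands in the cell with index 1.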

The cells are the part of the segment that is stored in a memory-mapped file, here called a shard. Another memory-mapped file holds attributes from the event itself, such as its ID. This second file is connected with the shard and is called the dynamic file. Each cell contains the offset in the dynamic file at which its list of events is stored. The cell itself may also contain numeric metrics calculated from the events, such as their sum/count, mean spike, median and so on, depending on the definition of the correlation rule. Once the information about the event is inserted into the given cell, and some of its attributes into the dynamic file, the evaluation phase ends. The row with its segments is synchronized to LevelDB so that the same memory structure can be reconstructed after a possible restart of the application. In the previous article, I mentioned the issue of storing pointers into memory mapped from a file; LevelDB helps here by storing the offsets of segments, which are used to reconstruct the pointers after a restart.
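The reason for storing offsets rather than pointers is that a file may be mapped at a different address after every restart, while a byte offset from the start of the mapping stays valid. The sketch below shows the idea in C; the `cell_t` layout and both helper functions are hypothetical, and the base address stands for whatever the mapping call returned in the current process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cell layout as it might sit in a shard. */
typedef struct {
    uint64_t events_offset; /* byte offset of the event list in the dynamic file */
    uint32_t count;         /* example metric: number of events in the cell */
    double   sum;           /* example metric: sum over the events */
} cell_t;

/* Convert a pointer inside the current mapping into a restart-safe offset. */
static uint64_t to_offset(const void *base, const void *ptr)
{
    const unsigned char *b = (const unsigned char *)base;
    const unsigned char *p = (const unsigned char *)ptr;
    assert(p >= b);
    return (uint64_t)(p - b);
}

/* Convert a stored offset back into a pointer for the current mapping.
 * Returns NULL when the offset lies outside the mapped region. */
static void *resolve_offset(void *base, size_t mapped_size, uint64_t offset)
{
    if (offset >= mapped_size)
        return NULL;
    return (unsigned char *)base + offset;
}
```

An offset computed against one mapping can be resolved against a later mapping of the same file, which is what makes it suitable for persisting in a key-value store.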

Analysis phase: The sliding window starts its work

The last phase is called analysis and is triggered either after an event is inserted into memory or periodically after a defined amount of time. In both cases, a sliding time window is constructed to look at a number of cells determined by the correlation rule. For example, if we wish to analyze failed user login attempts from the past five minutes, we can set the resolution (and thus the cell length) to one minute. The sliding time window then needs to cover five of these one-minute cells at once, for a total of five minutes. Several sliding time windows may be constructed, each starting at or some time before the changed cell, to ensure all cases are covered.
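To make the "all cases are covered" idea concrete, here is a small C sketch that enumerates every window of a given width that contains the changed cell. It assumes that windows are aligned to cell boundaries and that every such window is evaluated; the article does not say whether the correlator enumerates all of them, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical time window: [start_time, end_time). */
typedef struct {
    int64_t start_time; /* inclusive */
    int64_t end_time;   /* exclusive */
} window_t;

/* Writes into `out` every window of `window_cells` cells that contains the
 * cell starting at `changed_cell_start`, earliest window first.
 * Returns the number of windows written (at most `out_cap`). */
static uint32_t windows_covering_cell(int64_t changed_cell_start,
                                      int64_t resolution,
                                      uint32_t window_cells,
                                      window_t *out, uint32_t out_cap)
{
    uint32_t n = 0;
    assert(resolution > 0 && window_cells > 0);
    for (uint32_t k = 0; k < window_cells && n < out_cap; k++) {
        /* The changed cell is the last cell of window 0 and the first
         * cell of window (window_cells - 1). */
        int64_t start = changed_cell_start
                      - (int64_t)(window_cells - 1 - k) * resolution;
        out[n].start_time = start;
        out[n].end_time = start + (int64_t)window_cells * resolution;
        n++;
    }
    return n;
}
```

For a five-cell window with one-minute cells and a changed cell starting at second 600, this yields five windows, the first covering 360 to 660 and the last covering 600 to 900.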

Segments and cells

Once the time window is prepared, it executes an aggregator function such as sum or unique count. Using the row iterator object, the function traverses all cells covered by the window (in our example, always five). The row iterator iterates through the segment and, if it reaches a time gap, simply calculates the number of missing virtual cells (green cells in the picture above) and continues iterating over them; the metrics of virtual cells are always empty or equal to zero. When the gap is over, the iterator continues to the next segment, and so on, until the requested number of cells is reached.
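The traversal over real and virtual cells might look roughly like the following C sketch, which computes a sum over one window. It is a simplification under stated assumptions: segments are sorted by start time, do not overlap, and start on the same cell grid as the window; each cell carries a single counter. The names `row_segment_t`, `window_result_t` and `window_sum` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one segment of a row. */
typedef struct {
    int64_t         start_time; /* inclusive, aligned to the cell grid */
    uint32_t        cell_count;
    const uint32_t *counts;     /* one counter per cell */
} row_segment_t;

typedef struct {
    uint64_t sum;
    uint32_t real_cells;    /* cells backed by a segment */
    uint32_t virtual_cells; /* cells that fell into a gap */
} window_result_t;

static window_result_t window_sum(const row_segment_t *segs, size_t seg_count,
                                  int64_t resolution,
                                  int64_t window_start, uint32_t window_cells)
{
    window_result_t r = {0, 0, 0};
    int64_t t = window_start;
    uint32_t remaining = window_cells;
    size_t s = 0;

    assert(resolution > 0);
    while (remaining > 0) {
        /* Skip segments that end at or before the current time. */
        while (s < seg_count &&
               segs[s].start_time + (int64_t)segs[s].cell_count * resolution <= t)
            s++;

        if (s < seg_count && segs[s].start_time <= t) {
            /* Real cell: read its metric from the segment. */
            uint32_t idx = (uint32_t)((t - segs[s].start_time) / resolution);
            r.sum += segs[s].counts[idx];
            r.real_cells++;
            t += resolution;
            remaining--;
        } else {
            /* Gap: count the virtual cells up to the next segment (or the
             * end of the window) in one step; they contribute zero. */
            uint32_t missing = remaining;
            if (s < seg_count) {
                int64_t gap = (segs[s].start_time - t + resolution - 1) / resolution;
                if (gap < (int64_t)missing)
                    missing = (uint32_t)gap;
            }
            r.virtual_cells += missing;
            t += (int64_t)missing * resolution;
            remaining -= missing;
        }
    }
    return r;
}
```

The gap is never materialized: the iterator only needs the distance to the next segment to know how many empty cells to account for, which keeps sparse rows cheap to scan.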

Test and trigger phases: Detecting and taking action

The aggregator then passes the metrics to the window, which hands them to the analysis function for the test. If the result of the aggregator meets the test conditions, such as a sum of failed logins exceeding ten, an alert is triggered. If not, another time window is constructed, until all possible cases are covered.
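As a rough illustration of the test step, the C sketch below checks a list of per-window aggregates against a threshold and reports the first window that would trigger an alert. The rule structure and function names are hypothetical, and a simple "greater than" comparison stands in for whatever test conditions a real correlation rule defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical test rule: trigger when the aggregate exceeds a threshold. */
typedef struct {
    uint64_t threshold;
} test_rule_t;

static bool test_exceeds(const test_rule_t *rule, uint64_t aggregated)
{
    return aggregated > rule->threshold;
}

/* Walks the aggregates of the candidate windows in order and returns the
 * index of the first one that passes the test, or -1 if none does. */
static int first_triggering_window(const test_rule_t *rule,
                                   const uint64_t *window_sums, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        if (test_exceeds(rule, window_sums[i]))
            return i;
    }
    return -1;
}
```

With a threshold of ten, a window whose sum is exactly ten does not trigger, while a sum of eleven does.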

Flexibility

That, in brief, is how the window correlator, an application built around time windows, works. As you can see, there is plenty of room for parametrization, so that most known security incidents can be covered. The code is implemented in C to optimize performance, given the potential for thousands of iterations per second. Depending on the size of the data series, there may be multiple shards and dynamic files, each containing information about the cells and events inside a given segment and row. The alerts produced by the window correlator are then handled by alert management and notification services, so that every security incident is tracked and acted upon.

Want to know more?

Visit the previous article about building the Correlator, or our beginner-friendly introduction to event correlation.

About the Author

Premysl Cerny

Software Developer at TeskaLabs



