Analyzing the data

Whether your agent is powered by state-of-the-art reinforcement learning algorithms, a massive LLM sitting somewhere in the cloud, or a group of contractors in some distant country, you will need to analyze the data to tune your agent for better results. Even though you will likely have your own analytics machinery (such as TensorBoard), the AICA challenge provides you with data and logs to link your agent's analytics to the objective state of the environment.

The data is stored in two places:

- log files in the log/ directory,
- the database in the aica_challenge.db file.

Logs

Textual logs are generally intended only for checking what happened, not for any serious analysis. The log files are located in the log/ directory and are named cyst_<type>-<run_id>.log, where <type> is either messages or system and <run_id> is the CYST run ID of a particular episode.

System logs use INFO-level logging, so they record only the more important events, such as episode termination due to crossing an action or time threshold, or encountered exceptions. If everything goes well, this log can easily end up empty.

Message logs record messages as they traverse the system, so there will be plenty of entries for any non-trivial scenario. Every hop of a message from one machine to another is recorded, along with the processing the message undergoes. For any reasonable analysis, you are better off using the database, because the messages are stored there as well. Nevertheless, the message logs are often useful for finding the last message that caused a catastrophe.
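If you only want to peek at the end of a log, a few lines of Python are enough. The following is a minimal sketch that builds a log file name from the cyst_<type>-<run_id>.log pattern described above and returns its last non-empty line. The helper names log_path and last_entry are hypothetical, and the sketch assumes that each log entry occupies a single line, which the text above does not guarantee.

```python
from pathlib import Path
from typing import Optional


def log_path(log_dir: str, log_type: str, run_id: str) -> Path:
    """Build a log file path following the cyst_<type>-<run_id>.log pattern."""
    if log_type not in ("messages", "system"):
        raise ValueError(f"unknown log type: {log_type}")
    return Path(log_dir) / f"cyst_{log_type}-{run_id}.log"


def last_entry(path: Path) -> Optional[str]:
    """Return the last non-empty line of a log file, or None if there is none.

    Assumption: one log entry per line.
    """
    if not path.exists():
        return None
    last = None
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                last = line.rstrip("\n")
    return last
```

Calling last_entry(log_path("log", "messages", run_id)) with the CYST run ID of a failed episode would then give you the final recorded message entry.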

For everything else, there is the...

Database

The AICA challenge stores most of its runtime data in an SQLite3 database. The run specifications are stored here as well, but you should not need to tamper with them, so this section focuses only on the analytical data.

Once you have executed at least one run, the database is populated with data. Please note that neither the AICA challenge nor CYST uses this database to track or store internal state or intermediate results. The database exists purely for users, and we tried to make the data structure as friendly as possible.

A (CYST) string in a heading indicates a table produced directly by CYST and not specifically by the AICA challenge. Therefore, there may be some duplication and terminology clashes (especially with the word 'run').

Run statistics

Table: challenge_run_statistics
Fields:
- id: A primary key.
- status: The status of a particular run. If you see RunStatus.RUNNING after all processes have terminated, the run most likely ended prematurely with an error. If, however, you see RunStatus.FINISHED, the run ended correctly and the details field is filled in.
- details: Textual summary of the run. It has the information about successful and failed episodes.
- specification_id: An ID of run specification that was used for the run.

This table gives you a high-level overview of your run statuses. In general, if a run is finished, you can see at a glance whether all episodes were successful. In other cases, you can use this table as a pointer in your search for what went wrong. Most of the time, however, all the information you need is in the episode statistics.
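As a starting point for such a search, you can list the runs that never reached the finished state. This is a minimal sketch using Python's built-in sqlite3 module. The function name unfinished_runs is hypothetical, and the sketch assumes that the status column holds text ending in FINISHED (for example RunStatus.FINISHED) for successful runs.

```python
import sqlite3


def unfinished_runs(conn: sqlite3.Connection) -> list:
    """Return (id, status, specification_id) for runs that are not finished.

    Assumption: status is stored as text such as 'RunStatus.FINISHED'.
    """
    query = """
        select id, status, specification_id
        from challenge_run_statistics
        where status is null or status not like '%FINISHED'
        order by id
    """
    return conn.execute(query).fetchall()
```

You would typically pass it a connection opened with sqlite3.connect("aica_challenge.db").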

Episode statistics

Table: challenge_episode_statistics
Fields:
- id: A primary key.
- episode_number: The number of the episode within one run. These episode numbers are referenced in the details field of the previous table.
- stdout: Complete stdout record for the episode.
- stderr: Complete stderr record for the episode.
- cyst_run_id: The CYST run ID that you can use to relate this episode to CYST data tables (in general, the tables whose names are not prefixed with challenge_).
- status: The status with which the episode ended. As before, if you see it in the RunStatus.RUNNING state when all processes have terminated, something must have gone wrong.
- run_id: The id of a run this episode belongs to.

This table provides you with an overview of particular episodes within a run. As the console output is suppressed during parallel execution, you can find the complete standard output and standard error streams here. Probably the most important piece here is the cyst_run_id, which lets you connect this episode to CYST data tables.
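To find the episodes worth a closer look, you can filter this table for episodes that are still marked as running or that wrote something to stderr. The sketch below is one possible way to do it. The function name suspicious_episodes is hypothetical, the status is assumed to be stored as text ending in RUNNING, and a non-empty stderr does not necessarily mean a failure (it may contain only warnings).

```python
import sqlite3


def suspicious_episodes(conn: sqlite3.Connection, run_id: int) -> list:
    """Return (episode_number, status, cyst_run_id, stderr) for episodes of a
    run that are still marked as running or have a non-empty stderr."""
    query = """
        select episode_number, status, cyst_run_id, stderr
        from challenge_episode_statistics
        where run_id = ?
          and (status like '%RUNNING'
               or (stderr is not null and trim(stderr) != ''))
        order by episode_number
    """
    return conn.execute(query, (run_id,)).fetchall()
```

The returned cyst_run_id values can then be used to look up the same episodes in the CYST tables.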

Episode statistics (CYST)

Table: statistics
Fields:
- id: A primary key.
- run_id: A run ID for a particular episode. It maps to the cyst_run_id of the previous table.
- configuration_id: Not used.
- start_time_real: A real time when the episode started.
- end_time_real: A real time when the episode ended.
- end_time_virtual: A virtual timestamp when the episode ended.

This table supplements the challenge_episode_statistics table (or rather, it is the other way round) and provides timing information about the run of episodes. Real times are stored as floating-point timestamps, and you can either convert them in Python or, if you are just glancing at the data, use a service such as epochconverter to do it for you.
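The conversion in Python can be combined with the join between the two tables. The following sketch lists the episodes of one run together with their start and end times and real duration. The function name episode_timings is hypothetical, and the sketch assumes that the real times are seconds since the Unix epoch, as the reference to epochconverter suggests.

```python
import sqlite3
from datetime import datetime, timezone


def _to_utc(timestamp):
    """Convert a floating-point epoch timestamp to an aware UTC datetime."""
    if timestamp is None:
        return None
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def episode_timings(conn: sqlite3.Connection, run_id: int) -> list:
    """Return timing information for every episode of a challenge run."""
    query = """
        select e.episode_number, e.cyst_run_id,
               s.start_time_real, s.end_time_real, s.end_time_virtual
        from challenge_episode_statistics as e
        join statistics as s on s.run_id = e.cyst_run_id
        where e.run_id = ?
        order by e.episode_number
    """
    timings = []
    for number, cyst_run_id, start, end, virtual_end in conn.execute(query, (run_id,)):
        timings.append({
            "episode_number": number,
            "cyst_run_id": cyst_run_id,
            "started_at": _to_utc(start),
            "ended_at": _to_utc(end),
            # Duration is unknown if the episode never recorded both times.
            "real_duration_s": None if start is None or end is None else end - start,
            "end_time_virtual": virtual_end,
        })
    return timings
```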

Action records (CYST)

Table: action
Fields:
- id: A primary key.
- message_id: An ID of a message carrying the action.
- run_id: An ID of a CYST run where this message was sent.
- action_id: A name of the action.
- caller_id: A name of the agent that executed this action.
- src_ip: A source IP address of the action message.
- dst_ip: A destination IP address of the action message (i.e., the target of the action).
- dst_service: A target service of the action (can be empty if no service was set).
- status_origin: An origin of the status in response to the action (e.g., StatusOrigin.SERVICE)
- status_value: A value of the status in response to the action (e.g., StatusValue.SUCCESS)
- status_detail: A detail of the status in response to the action (can be empty, e.g., StatusDetail.SERVICE_NOT_PROVIDED)
- response: A content of the response. Usually a string containing the error message or a JSON structure with data if the action was a success.
- session_in: The ID of a session that was used in the request.
- session_out: The ID of a session that was used in the response. If an action does not create a new session, this is the same as session_in.
- auth_in: An identification of authorization/authentication that was used in the request.
- auth_out: An identification of authorization/authentication that was used in the response. If an action does not create a new authorization, this is the same as auth_in. Authentication tokens are passed in the response field.

This table gives an account of all high-level actions that were executed. It puts together information from a request and a response. Some actions, for example ac1:scan_network, are composite actions that consist of many other actions (in this case ac1:scan_host) executed in their context. This table does not track these subordinate actions, which makes it easier to analyze the success of agents' behavior. If you want to analyze these actions as well, you need to consult Message records.
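A simple first analysis of agent behavior is to count how often each action ended with each status value. The sketch below does this for one CYST run. The function name action_outcomes is hypothetical, and the status values are returned exactly as they are stored in the status_value column.

```python
import sqlite3


def action_outcomes(conn: sqlite3.Connection, cyst_run_id: str) -> dict:
    """Count actions of one CYST run per action name and status value.

    Returns a mapping of the form {action_id: {status_value: count}}.
    """
    query = """
        select action_id, status_value, count(*)
        from action
        where run_id = ?
        group by action_id, status_value
    """
    outcomes = {}
    for action_id, status_value, count in conn.execute(query, (cyst_run_id,)):
        outcomes.setdefault(action_id, {})[status_value] = count
    return outcomes
```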

Action parameter records (CYST)

Table: action_parameter
Fields:
- id: A primary key.
- name: The name of the parameter.
- value: A stringified value of the parameter.
- action_id: A reference to the action from the previous table.

This table tracks all the parameters that were supplied to the actions. If you want to extract actions together with their parameters, you will need to use a bit of SQL to convert known parameter names to columns. For example:

-- One output column per known parameter name;
-- actions that do not have the parameter get NULL in that column.
select action.*,
       max(case when action_parameter.name = 'net' then action_parameter.value end) as param_net,
       max(case when action_parameter.name = 'path' then action_parameter.value end) as param_path
from action
left join action_parameter on action.id = action_parameter.action_id
group by action.id;
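If you do not know the parameter names in advance, the same pivot can be generated from the data. The following sketch reads the distinct parameter names from action_parameter and builds one param_<name> column for each of them. The function name actions_with_parameters is hypothetical. The parameter names are passed as bound values and the column aliases are quoted, so unusual names should not break the query.

```python
import sqlite3


def actions_with_parameters(conn: sqlite3.Connection) -> list:
    """Return every action as a dict with one param_<name> key per parameter
    name found in the action_parameter table."""
    names = [
        row[0]
        for row in conn.execute(
            "select distinct name from action_parameter "
            "where name is not null order by name"
        )
    ]
    columns = ["action.*"]
    for name in names:
        # Quote the alias as an SQL identifier; the name itself is bound via '?'.
        alias = '"param_' + name.replace('"', '""') + '"'
        columns.append(
            "max(case when action_parameter.name = ? "
            f"then action_parameter.value end) as {alias}"
        )
    query = (
        f"select {', '.join(columns)} "
        "from action "
        "left join action_parameter on action.id = action_parameter.action_id "
        "group by action.id "
        "order by action.id"
    )
    cursor = conn.execute(query, names)
    headers = [column[0] for column in cursor.description]
    return [dict(zip(headers, row)) for row in cursor]
```

Actions that were called without a given parameter get None under the corresponding key.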

Message records (CYST)

Table: message
Fields:
- id: A primary key.
- message_id: An ID of a message in a particular episode.
- type: The type of the message (e.g., MessageType.REQUEST).
- run_id: An ID of a CYST run where this message was sent.
- action_id: The name of the action this message carries.
- caller_id: An ID of the agent that sent the message.
- src_ip: A source IP of the message.
- dst_ip: A destination IP of the message.
- dst_service: A destination service of the message.
- ttl: A time-to-live attribute of the message. Most of the time it is not useful, but it can help you track how far a particular message has traveled.
- status_origin: An origin of the status if the message type is a response (e.g., StatusOrigin.SERVICE)
- status_value: A value of the status if the message type is a response (e.g., StatusValue.SUCCESS)
- status_detail: A detail of the status if the message type is a response (can be empty, e.g., StatusDetail.SERVICE_NOT_PROVIDED)
- session: A session ID of the message.
- auth: Authentication/authorization identification the message is carrying.
- response: A response content if the message type is response.

This table tracks messages exchanged between actors of particular runs. It tracks them at the point when the message is sent. Therefore, if you want to track the message during its entire journey through the system, you need to turn to the following table.

Platform specific message records (CYST)

Table: platform_specific
Fields:
- id: A primary key.
- name: A name of a platform-specific parameter.
- value: A stringified value of a platform-specific parameter.
- message_id: A reference to the message from the previous table.

CYST supports different execution platforms (such as executing the runs in a Docker-based emulation environment). Each such platform can supply information with different granularity to the message. For example, the Docker platform is not able to track the messages as they are going through the network, but the CYST simulation platform is. This table is then used to track these different data points.

In the AICA challenge, you can be certain that the CYST simulation platform is used. This platform provides information about each hop of a message, that is, the IP and ID of the machine the message hops from and of the machine it hops to. You can combine this information with the previous table with a bit of SQL. The result then contains all the messages as they traversed the network:

-- Turn the per-hop name/value pairs into columns, one output row per message record.
select message.*,
       max(case when platform_specific.name = 'current_hop_ip' then platform_specific.value end) as current_hop_ip,
       max(case when platform_specific.name = 'current_hop_id' then platform_specific.value end) as current_hop_id,
       max(case when platform_specific.name = 'next_hop_ip' then platform_specific.value end) as next_hop_ip,
       max(case when platform_specific.name = 'next_hop_id' then platform_specific.value end) as next_hop_id
from message
left join platform_specific on message.id = platform_specific.message_id
group by message.id;
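To follow a single message on its way through the network, you can restrict the query above to one CYST run and one message ID. This is a minimal sketch with a hypothetical function name, message_hops. It assumes that the primary key of the message table grows in the order in which the records were written, so that sorting by it yields the hops in order.

```python
import sqlite3


def message_hops(conn: sqlite3.Connection, cyst_run_id: str, message_id: int) -> list:
    """Return (type, current_hop_ip, next_hop_ip) for every recorded hop of a
    message, ordered by the primary key of the message table.

    Assumption: primary keys reflect the order in which hops were recorded.
    """
    query = """
        select message.type,
               max(case when platform_specific.name = 'current_hop_ip'
                        then platform_specific.value end),
               max(case when platform_specific.name = 'next_hop_ip'
                        then platform_specific.value end)
        from message
        left join platform_specific on message.id = platform_specific.message_id
        where message.run_id = ? and message.message_id = ?
        group by message.id
        order by message.id
    """
    return conn.execute(query, (cyst_run_id, message_id)).fetchall()
```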

Signalling (CYST)

Table: signal
Fields:
- id: A primary key.
- run_id: An ID of a CYST run where this signal occurred.
- signal_origin: The source of the signal, usually either __environment or agent's ID.
- state: A new state the signal origin has entered.
- effect_origin: An identification of the entity that prompted this state change.
- effect_message: An identification of a message that prompted the state change, or -1 if there is no message that the state change can be attributed to.
- effect_description: A description of what caused the state change, or any other data (in case of agents signalling that they reached the goal).

This table tracks the signalling communication between the environment and its components and agents. In most cases, you will be using this table only to review the signals your agents were sending.
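Reviewing the signals of one agent can be sketched as follows. The function name agent_signals is hypothetical, and the sketch assumes that the primary key of the signal table reflects the order in which the signals were recorded.

```python
import sqlite3


def agent_signals(conn: sqlite3.Connection, cyst_run_id: str, agent_id: str) -> list:
    """Return (state, effect_origin, effect_message, effect_description) for
    all signals that a given agent sent in one CYST run.

    Assumption: primary keys reflect the order in which signals were recorded.
    """
    query = """
        select state, effect_origin, effect_message, effect_description
        from signal
        where run_id = ? and signal_origin = ?
        order by id
    """
    return conn.execute(query, (cyst_run_id, agent_id)).fetchall()
```

Passing __environment instead of an agent's ID would list the signals that originated from the environment.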