Running the challenge

If you select the third option in the menu, you will be taken to the main interface for preparing and running the challenge.

The key concept here is the run specification. It carries the parameters needed to execute scenarios with the agent of your choice.

Creating a first run specification

When you first start the challenge, there are no specifications available, so let's begin with creating a new one by selecting the first option.

As you can see, a dialog pops up. Because no run specification exists yet, just select No; we will get back to this dialog later.

A run specification consists of several parameters, which you can set one after another in the menu:

name is the unique identifier of the specification. If you choose a name that already exists, you will overwrite the existing specification.
description helps you organize your run specifications when you have many of them. You can put any additional information here that helps you tell the specifications apart.
agent is the agent that will be used for a challenge run. Each challenge run uses exactly one agent, which you choose from those installed in the system. If you remove an agent after creating a specification that depends on it, you will encounter errors. So... don't do that.

scenario is the scenario on which your agent will be trained or tested. Instead of picking a specific scenario, you can let the system choose one at random. Be aware that a random scenario is chosen anew for every episode of a run.

variant is the scenario variant to use. As with the scenario, you can either choose a specific variant or let the system decide randomly. If you have chosen a random scenario, a random variant is your only option, because the number of variants differs between scenarios.
run parameters enable you to fine-tune your challenge runs. The currently available parameters are:
- max_time representing the maximum number of virtual seconds that will elapse before the episode is terminated.
- max_actions representing the maximum number of actions by one agent. After this threshold is reached, the episode is terminated.
- max_episodes representing the maximum number of episodes that are executed, before the run is concluded.
- max_parallel representing the maximum number of episodes that can run in parallel. While the simulation is lightweight and can run many parallel instances, be mindful of your agent's requirements and set a reasonable value.
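To make the shape of a run specification more concrete, the parameters above can be mirrored in a small Python data class. This is only an illustrative sketch: the class `RunSpecification`, its default values, and the use of `None` to mean "choose at random" are assumptions made for this example and are not the actual classes or defaults of aica_challenge_1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunSpecification:
    """Hypothetical mirror of the parameters described above."""

    name: str                        # unique identifier of the specification
    agent: str                       # exactly one agent per challenge run
    description: str = ""            # free text to tell specifications apart
    scenario: Optional[str] = None   # None stands for "choose at random"
    variant: Optional[str] = None    # None stands for "choose at random"
    # The numeric defaults below are arbitrary placeholders.
    max_time: int = 3600             # virtual seconds per episode
    max_actions: int = 100           # actions by one agent per episode
    max_episodes: int = 10           # episodes per run
    max_parallel: int = 1            # episodes running at the same time

    def __post_init__(self) -> None:
        # A random scenario only allows a random variant, because the
        # number of variants differs between scenarios.
        if self.scenario is None and self.variant is not None:
            raise ValueError("a random scenario requires a random variant")
        for limit in ("max_time", "max_actions", "max_episodes", "max_parallel"):
            if getattr(self, limit) < 1:
                raise ValueError(f"{limit} must be at least 1")


spec = RunSpecification(
    name="baseline",
    agent="my_agent",          # hypothetical agent name
    scenario="scenario_1",     # hypothetical scenario name
    variant="variant_1",       # hypothetical variant name
    max_parallel=4,
)
```

Checking the scenario/variant rule at construction time mirrors what the menu does by offering only the random variant once a random scenario is selected.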

Once you have set your parameters, you can view the run specification by selecting View template from the menu. You will see something like this:

Finally, if you choose the Save option, the run specification will be saved to the database and can be used from then on.

Creating and editing other specifications

When you have at least one specification saved, you can use it as a template for other specifications. When you create a new specification and opt to use a template, you will be given an option to select one of the existing specifications, and a copy will be made for you. You then only need to change the parameters to your liking.
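The template workflow can be pictured as "copy, rename, adjust, save". The sketch below illustrates that idea with a plain dictionary standing in for the database. The names `store`, `save`, and `new_from_template`, as well as the field layout, are hypothetical and only serve the example; they are not part of the TUI or of aica_challenge_1.

```python
import copy

# Hypothetical in-memory stand-in for the specification database, keyed by name.
store = {}


def save(spec: dict) -> None:
    """Store a specification; an existing entry with the same name is overwritten."""
    store[spec["name"]] = copy.deepcopy(spec)


def new_from_template(template_name: str, new_name: str) -> dict:
    """Return an independent copy of a stored specification under a new name."""
    spec = copy.deepcopy(store[template_name])
    spec["name"] = new_name
    return spec


save({
    "name": "baseline",
    "agent": "my_agent",  # hypothetical agent name
    "run_parameters": {"max_episodes": 10, "max_parallel": 1},
})

# Start from the template and change only what differs.
longer = new_from_template("baseline", "baseline-long")
longer["run_parameters"]["max_episodes"] = 100
save(longer)
```

The deep copy matters here: without it, changing the nested run parameters of the new specification would silently change the template as well.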

The same applies to editing existing templates.

Running a challenge according to a specification

If you choose the New run option in the menu,

you will have to select the run specification that you want to execute,

and if you choose one, you will face a final confirmation prompt:

Press Yes and the run will start.

Checking the state of runs

The TUI only keeps track of the runs executed while it is running. If you want to check historic runs, go to the next section.

If you select the Run status option from the menu,

you will get an overview of the runs that are currently underway or that have already finished within the current TUI process:

If you choose one specific run, you will get detailed information about that particular run:

Here you can see which episodes are planned for execution, which episodes have finished, and how they finished. This screen is useful only for up to a handful of episodes. If you plan to use a much larger number of episodes or runs, it is better to use the Challenge class from aica_challenge_1 directly.
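When driving many episodes from a script, it helps to have a mental model of how max_episodes and max_parallel interact. The interface of the Challenge class is not covered in this section, so the sketch below deliberately does not use it. It only illustrates, with the standard asyncio library, one way to run a fixed number of episodes with a bounded number in flight. The functions `run_episode` and `run_all` and the scenario names are hypothetical placeholders.

```python
import asyncio
import random


async def run_episode(episode_id: int, scenarios: list) -> tuple:
    """Placeholder for one episode; a real episode would drive the agent."""
    scenario = random.choice(scenarios)  # a random scenario is drawn per episode
    await asyncio.sleep(0)               # stands in for the simulation work
    return episode_id, scenario


async def run_all(max_episodes: int, max_parallel: int, scenarios: list) -> list:
    """Run max_episodes episodes with at most max_parallel running at once."""
    limiter = asyncio.Semaphore(max_parallel)

    async def limited(episode_id: int) -> tuple:
        async with limiter:  # waits while max_parallel episodes are in flight
            return await run_episode(episode_id, scenarios)

    # gather returns the results in the order the episodes were submitted
    return await asyncio.gather(*(limited(i) for i in range(max_episodes)))


results = asyncio.run(
    run_all(max_episodes=20, max_parallel=4, scenarios=["scenario_a", "scenario_b"])
)
```

A semaphore keeps the sketch short; the point is only that raising max_parallel changes how many episodes overlap, not how many are executed in total.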

If you try to exit the TUI while any run is still underway, you will be greeted with this screen:

If you select Yes, the running and future episodes of each executed run will be forcibly terminated, and the amount of information in the logs and the database may vary. So be aware of that.

Analyzing the runs

Now that you know whether a run has finished successfully or not, it's time to analyze what happened and how well your agent handled what was thrown at it. Continue to the Analyzing the data section.