CREATING REUSABLE VIRTUAL MACHINES TO SIMULATE NETWORKS FOR CYBER CHALLENGES

Cyber challenges are a way for educators to train and assess the desperately needed cyber security professionals of the future. Creating these challenges can be time consuming, error prone, and cost-prohibitive. To address these issues, the University of Rhode Island has created the freely available Open Cyber Challenge Platform (OCCP). To maximize ease of use and reuse of the challenges created on this platform, the OCCP required a tool to automate the configuration and deployment of virtual machines. This thesis project created such a tool, which allows OCCP challenge implementers to create virtual machines that are reconfigurable and sharable with the OCCP community.

There does not exist a tool to create reusable virtual machines for cyber challenges, which makes cyber challenges difficult to create and reuse. This thesis project extends the Open Cyber Challenge Platform by creating a tool to automate the configuration and deployment of virtual machines in cyber challenges.

Motivation
The market for cyber security jobs is large and continuing to grow. According to Burning Glass International Inc., a Boston-based company that uses artificial intelligence to match jobs and job seekers, cyber security postings grew 74% from 2007 to 2013, twice as fast as all IT jobs. The report also indicates that the demand for cyber security talent is exceeding the supply: these job postings took 36% longer to fill than all job postings [1]. There is a dire need to educate more people to fill these positions.
Many agree that cyber challenges are a useful tool for education [2,3,4,5] because they allow participants to both learn and test their abilities in a hands-on manner. Cyber challenges simulate real world networks, and allow participants to practice their skills without harming production systems. However, creating a cyber challenge is a difficult and time-consuming process, and the efforts generally have limited use after the initial offering of the challenge. Challenges require a high degree of technical ability to create and take significant time to plan and implement. Once they are created, they require additional time and effort to change subtle details for reuse.
The Open Cyber Challenge Platform (OCCP), being developed by the University of Rhode Island under funding from the U.S. National Science Foundation, aims to provide a common platform on which challenges could be run. This thesis project extends the OCCP to facilitate reuse and shareability amongst the platform's users. Among these users are different audiences with varying technical abilities. These audiences will make use of this work based on their skill level.

Goals
The goals of this thesis project are:
Goal 1: Create an application and administrative virtual machine (Admin VM) to configure virtual machines for the OCCP
• The Admin VM should be able to configure virtual machines by performing tasks such as the installation of software, creation of user accounts, configuration of services, and transfer of files and other content.
• The Admin VM should provide a mechanism to produce automatically generated content such as usernames, passwords, and SSH keys for use in the configuration of the virtual machines.
Goal 2: The extended Admin VM will have reasonable expectations of technical ability for the target audiences.
• The extension should build on the existing syntax used by the current Admin VM and extend it by providing the ability to describe the configuration of the desired virtual machine.
• The extension should only include components that are reasonable for different audiences with varying technical skill to operate.

Goal 3: Avoid the introduction of a financial cost
The extension of the Admin VM should not introduce a financial cost to extend the OCCP's current functionality.
Goal 4: The extended Admin VM should provide a distribution mechanism with reasonable expectations of ability for the target audiences
The Admin VM should provide a mechanism by which the virtual machines that it produced could be shared with other OCCP users. The mechanism should consider the skill levels of the target audiences to ensure usability.

project seeks to increase the number of qualified students entering the fields of information assurance, cyber security, and digital forensics, and to broadly increase the capacity of U.S. higher education to produce professionals in these fields [2]. Goals of the OCCP are to be free, configurable, and extensible. These goals minimize the barriers to entry and help motivate the community to produce a shared base of cyber security education tools that utilize the OCCP.

Summary of Accomplishments
The platform is designed to work with Virtual Machines (VMs) to simulate networked computers for a cyber challenge. Virtual machines are essentially a software emulation of a physical computer. These VMs run on a hypervisor, which manages a physical machine's resources to accommodate the virtual machine's resource requests. There are two main categories of hypervisors, often referred to as type one and type two. Type one hypervisors are installed directly on a physical machine; whereas type two hypervisors are installed as an application on top of an operating system.
Virtualization eases some of the logistical concerns of deploying and running cyber challenges. Though the hardware requirements of the hypervisor still must be considered, it will likely require fewer physical machines to run a challenge with virtual machines than it would to run the same challenge without virtualization.
Additionally, configuring a group of physical machines for a cyber challenge is difficult due to the financial cost, space requirements, and person hours required [3].
Furthermore, such a setup is inherently rigid in its configuration. Even slight modifications to the network topology can require introducing new hardware and modifying the existing machines. Virtualization can make copying machines or changing the networking as simple as a few mouse clicks instead of relying on tools like Clonezilla [4] or FOG project [5] to clone hard drives.
While virtualization makes it easier to work with the machines involved in a cyber challenge, there are complicating elements to creating challenges that virtualization cannot take away. For instance, whether virtual or physical, the machines must have an operating system installed. Then, each machine will still need additional configuration, such as the addition of user accounts, software, and other content. Some services may depend on the configuration of other machines, so great care must be taken to ensure that these elements match across the network. These problems are not solved by simply transitioning from physical machines to virtual ones.
The OCCP architecture has two core virtual machines known as the "Game Server" and "Administrative Virtual Machine". The Game Server concept is common with cyber challenges and is typically responsible for the scheduling and scor-

Target Audiences
The OCCP VM Builder considered three target audiences during its development.
• Advanced Contributor. This audience has the largest skill set and is accustomed to creating cyber challenges. They have a solid understanding of machine deployment, configuration, and the necessary elements of a cyber challenge. This audience can use and understand technical documentation for software they are not familiar with. They may also have experience maintaining production systems and using tools to support them like configuration management systems.
• Basic Contributor. This audience has less expertise than the advanced contributors, but is still familiar with the general work required to create a challenge. A Basic Contributor, given the description of a challenge, may be able to identify pieces that could be replaced or altered without destroying the functionality of the original challenge.
• Basic Administrator. This audience does not have the technical expertise of either of the Contributor audiences to produce a cyber challenge, but is capable of running an existing cyber challenge.
The OCCP VM Builder is designed to support all three audiences in creating and using cyber challenges within the scope of their abilities.
The Admin VM's extension allows the Advanced Contributors to share their work with the remaining audiences. The secondary audiences can produce functionally equivalent machines and reuse them by making configuration changes that, when done manually, are tedious and error prone. This project also provides challenge contributors with support mechanisms, such as those that produce randomized usernames, passwords, and SSH keys, material that is common to most machine configurations but needlessly time consuming to produce manually. The end result of the OCCP VM Builder is a tool that allows challenge creators to share their efforts with others who have varying technical backgrounds while also easing some of the burdens of creating challenges.

NICE Challenge Project
After the work of this thesis was completed, a similar platform known as the NICE Challenge Project [6] was discovered. Although information about this project is limited at this time, the available documentation seems to suggest that it includes a component with a similar purpose to the tool created by this thesis.
Specifically, their "environment modification engine" may make their challenges reusable by modifying content within them. However, since the project has not been released yet, it is difficult to determine how similar the tools are. According to their website, a beta release is scheduled for the beginning of 2015.

Related Technologies
This section will discuss the technologies related to this thesis project. Table 1 summarizes the information detailed in the subsections for each technology. My approach to extending the Admin VM was to make use of a configuration management system (CMS) as the mechanism to configure machines. Unfortunately, these CMSs are not the perfect solution, as they are specifically designed for configuration management and not provisioning or deployment. Additionally, using these systems directly would force all users to learn the respective system to create and use challenges. Despite this, a CMS can be incorporated into the Admin VM to fulfill some of this project's goals.
When my initial research into Puppet and Ansible started, Puppet supported Windows, whereas Ansible did not. Ansible has since added partial Windows support [11], but that support is still under development. Puppet's mature support for Windows led me to choose it as the CMS used by the Admin VM.

Vagrant
Vagrant [9] is the most similar technology to what this project aimed to produce. Vagrant has the ability to provision virtual machines on several different hypervisors and can also make use of CMSs to configure machines. However, Vagrant has several drawbacks.
In order to provision a machine, Vagrant makes use of what it calls "providers".
Out of the box, Vagrant comes with a free VirtualBox [12] provider, but the VMware provider must be paid for [13]. Since one of the goals of this project is to not introduce a financial cost, and since the OCCP Admin VM was already able to work with VirtualBox or VMware free of charge, I decided not to use Vagrant.

Docker
Docker [10] is a tool by which users can package software and its dependencies to run on a wide variety of Linux systems. This is potentially useful in the creation of challenges because of its modular nature, but there are several drawbacks. One drawback is that Docker will only work with Linux. While Linux training is a vital part of security education, networks are often made up of a diverse set of operating systems, not just Linux. Furthermore, Docker works by using Linux containers that share a common kernel. In some cyber challenges a shared kernel could produce unwanted results. For instance, a vulnerability that exploits the kernel could cause all other Docker applications sharing that kernel to be exposed unintentionally. Finally, Docker is a fairly new tool, and the makers even say "Please note Docker is currently under heavy development. It should not be used in production (yet)" [14]. For these reasons I did not use Docker.

Setup Network
In order to facilitate the use of the CMS, I added the setup network capability to the Admin VM. This required the Admin VM to have two network interfaces.
The first interface was connected to a network that could reach the Internet. The second was attached to the setup network. Since only the Admin VM and the VMs being configured would be attached to this isolated network, the Admin VM could have full control over it. This design has the Admin VM as the gateway/router for the other VMs, and as a DHCP server. By running the DHCP server, the Admin VM can assign a known address to each VM under its control. Acting as a router, the Admin VM would forward any traffic to its other interface in order to reach the Internet. Figure 2 shows the new setup. This figure also depicts a "proxy cache", which is described in Section 3.5. The establishment of the setup network made it possible to use the CMS; however, I enhanced the process with the introduction of configuration phases.
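The idea of handing every VM a known address on the isolated setup network can be sketched as follows. This is a minimal illustration only, not the Admin VM's actual DHCP configuration: the function name `assign_addresses`, the subnet `10.250.0.0/24`, and the VM names are all assumptions made for the example.

```python
import ipaddress


def assign_addresses(vm_names, subnet="10.250.0.0/24"):
    """Map each VM name to a fixed address on the setup network.

    The subnet is a made-up example. The first usable address is
    reserved for the Admin VM, which acts as gateway and DHCP server.
    """
    hosts = iter(ipaddress.ip_network(subnet).hosts())
    gateway = next(hosts)
    leases = {}
    for name in vm_names:
        try:
            leases[name] = next(hosts)
        except StopIteration:
            raise ValueError(f"subnet {subnet} is too small for these VMs")
    return gateway, leases


gateway, leases = assign_addresses(["web", "db", "mail"])
print("gateway:", gateway)
for name, address in leases.items():
    print(name, "->", address)
```

Because the mapping is deterministic, whatever drives the CMS always knows which address belongs to which VM, which is the property the setup network relies on.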

Configuration Phases
Regardless of how a VM is set up, the date and time at which it was set up can affect its final state. For instance, if you take two identical VMs and install software on one with a package manager, but wait a month to install that same software on the other, you can end up with two different versions. For cyber challenges this can prove problematic, since particular versions may have vulnerabilities that the challenge was intending to make use of. Furthermore, configuring software and other details about a VM can be relatively quick compared to the time it takes to download and install the software. For these reasons, I decided to separate software installation from the rest of the VM's configuration and do them in two phases.

The Installation Phase
This phase is strictly for the installation of software and other configurations that depend on when that software installation occurs. Items configured in this stage should be considered static and not easily reconfigured without starting from scratch. The Contributor creating the challenge should be the only person that applies this phase. The product of applying the Installation Phase is the state of the VM that will be shared when the Contributor publishes their challenge. This state is captured by the OCCP VM Builder as a VM snapshot called "phase1".

The Customization Phase
This phase is where the general configuration of the VM occurs. Figure 3 shows this process and the audiences that would be involved at each step.

Regeneration
Because the Customization Phase is where all of the dynamic configuration occurs, changes can be applied by reverting to the phase1 snapshot and applying the Customization Phase again. Instead of requiring users to do this by hand, I introduced a "regeneration" flag to the OCCP VM Builder. The regeneration process is summarized in Figure 4. In order to maintain flexibility, my extension still allows Contributors to create VMs by hand. The only additional requirement is that they must manually take a snapshot called phase2. This is because the OCCP VM Builder now considers the phase2 snapshot as the state of the VM that is ready for use in a challenge.
Since manually built VMs do not use Content Packs, there is no point in manually creating the snapshot that would have been taken at the end of the Installation Phase. There would be nothing for the OCCP VM Builder to do between the phase1 and phase2 snapshots; therefore, manually built VMs must not have a phase1 snapshot. As shown in Figure 4, VMs configured by hand will simply be reverted to the manually taken phase2 snapshot during regeneration. If, however, a VM has a phase1 snapshot and is therefore eligible for regeneration, it will be reverted to that snapshot. The VM's current phase2 snapshot will be removed and the Customization Phase will be reapplied. The process of regeneration is most useful when the scenario XML file makes use of OCCP Variables and OCCP Generators, which are described in 3.1.4.
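The regeneration decision described above can be summarized in a short sketch. It is an illustration under stated assumptions rather than the OCCP VM Builder's code: the function name `plan_regeneration` and the step strings are invented here, and the final "take phase2 snapshot" step is an inference from the statement that phase2 is the ready-for-use state.

```python
def plan_regeneration(snapshots):
    """Return the ordered steps for regenerating one VM.

    `snapshots` is the set of snapshot names the VM currently has.
    """
    if "phase1" in snapshots:
        # Built by the OCCP VM Builder: rebuild from the installed state.
        steps = ["revert to phase1"]
        if "phase2" in snapshots:
            steps.append("delete phase2")
        steps += ["apply Customization Phase", "take phase2 snapshot"]
        return steps
    if "phase2" in snapshots:
        # Built by hand: nothing to reapply, just restore the ready state.
        return ["revert to phase2"]
    raise ValueError("VM has neither a phase1 nor a phase2 snapshot")


print(plan_regeneration({"phase1", "phase2"}))
print(plan_regeneration({"phase2"}))
```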

OCCP Variables and Generators

Variables
My OCCP VM Builder also added variables to the scenario XML file specification. The variables allow for a single editable location to affect one or more places throughout the scenario XML file. For instance, a variable could hold a username that is used throughout the rest of the scenario file. All of the target audiences should be able to update a variable with ease since it only involves changing the value in one location. Furthermore, before this introduction of variables, they would have had to update their scenario XML file by hand in a similar fashion.

Arrays
In order to provide greater functionality to variables, I also introduced variable arrays. An array can contain multiple elements per variable, which can then be referenced by their index (location) in the array. Though arrays are more complex than a simple variable, editing an element is no more complex than editing a single variable. Arrays can pass larger quantities of related data to Content Packs more conveniently than multiple variables can. An example usage of an OCCP array would be to hold multiple usernames and then reference them throughout the scenario file by index, or to pass them to a Content Pack all at once.
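To make the single-edit-point idea concrete, the sketch below expands scalar and array references in a line of text. The `${name}` and `${name[index]}` placeholder syntax is hypothetical and chosen only for this example; the actual reference syntax of the scenario XML file is not reproduced here.

```python
import re

# Hypothetical placeholder syntax: ${name} or ${name[index]}
PLACEHOLDER = re.compile(r"\$\{(\w+)(?:\[(\d+)\])?\}")


def expand(text, variables):
    """Replace every placeholder in `text` with its value."""
    def replace(match):
        name, index = match.group(1), match.group(2)
        value = variables[name]
        if index is not None:
            return str(value[int(index)])  # array element by index
        return str(value)                  # plain variable
    return PLACEHOLDER.sub(replace, text)


variables = {"admin": "jsmith", "users": ["alice", "bob", "carol"]}
line = "owner=${admin} first=${users[0]} last=${users[2]}"
print(expand(line, variables))  # owner=jsmith first=alice last=carol
```

Changing `admin` or one element of `users` in the dictionary changes every place that references it, which is the convenience the variables and arrays are meant to provide.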

Generators
In addition to these variables, I also introduced generators that fill in the content of the variables and arrays. These generators can take parameters that affect their output. My design called for three proof-of-concept generators: a username generator, a password generator, and an SSH key generator.

Username Generator
The username generator reads from a comma-separated values (CSV) file that contains a first name, a last name, and a username on each line, and uses it to produce randomized usernames. The generator randomly picks lines from this file and creates variables for each of the values. The user can optionally specify a count to generate more than one set, with the guarantee that they will not get duplicate usernames provided there are no duplicate entries in the CSV file. When using a count greater than one, the variables generated will be arrays.
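A minimal sketch of this behavior, assuming a three-column CSV with no header row, might look like the following. The function name `generate_usernames`, the dictionary keys, and the sample names are illustrative assumptions, not the generator's real interface.

```python
import csv
import io
import random


def generate_usernames(csv_file, count=1, rng=random):
    """Pick `count` distinct lines of (first, last, username)."""
    rows = [row for row in csv.reader(csv_file) if row]
    if count < 1 or count > len(rows):
        raise ValueError(f"count must be between 1 and {len(rows)}")
    # sample() never returns the same line twice, so the usernames are
    # unique as long as the file itself contains no duplicate entries.
    picked = rng.sample(rows, count)
    first, last, user = (list(column) for column in zip(*picked))
    if count == 1:
        # A count of one yields plain variables rather than arrays.
        return {"first": first[0], "last": last[0], "username": user[0]}
    return {"first": first, "last": last, "username": user}


sample = io.StringIO(
    "Ada,Lovelace,alovelace\nAlan,Turing,aturing\nGrace,Hopper,ghopper\n"
)
print(generate_usernames(sample, count=2))
```

Requesting more names than the file contains raises an error instead of silently repeating entries, which keeps the no-duplicates guarantee intact.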

SSH Key Generator
The SSH key generator produces the public and private parts of an SSH key in separate variables. Like the username generator, a count may also be specified to create several key pairs at once as arrays. Optionally, a password may be provided to the generator to encrypt the private key.

Password Generator
The password generator is the most complex of the three generators that I built because it can take a variety of parameters. The generator's basic functionality produces the encrypted form of a plain text password, which may be provided to it as a parameter or randomly generated. One plain text password may be provided directly, or a file containing one plain text password per line may be provided for the generator to choose from at random. The user can provide additional parameters, such as length and type, for the generator to use to create random plain text. The user may specify SHA256, SHA512, or MD5 as the encryption algorithm for greater flexibility. Like the username and SSH key generators, a count can be specified, but it will be ignored if a plain text password was provided directly.
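The way the plain text might be selected from these parameters is sketched below. This covers only the choice of plain text; the hashing step is deliberately left out. The function name, the parameter names, the `kind` values, and the precedence of a direct password over a candidate list are all assumptions made for illustration.

```python
import secrets
import string

# Hypothetical "type" values; the real generator's options may differ.
ALPHABETS = {
    "alpha": string.ascii_letters,
    "numeric": string.digits,
    "alphanumeric": string.ascii_letters + string.digits,
}


def choose_plaintexts(password=None, candidates=None, length=12,
                      kind="alphanumeric", count=1):
    """Return the plain text password(s) that would then be hashed."""
    if password is not None:
        # A directly supplied password wins and the count is ignored.
        return [password]
    if candidates:
        # `candidates` stands in for the lines of a password file.
        return [secrets.choice(candidates) for _ in range(count)]
    alphabet = ALPHABETS[kind]
    return ["".join(secrets.choice(alphabet) for _ in range(length))
            for _ in range(count)]


print(choose_plaintexts(password="hunter2", count=5))
print(choose_plaintexts(length=8, kind="numeric", count=3))
```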

Summary of Variables and Generators
Variables and generators are important to regeneration because, in addition to reverting the VMs and reapplying the Customization Phase, regeneration causes the generators to produce new content. Depending on how the variables and generators were used, the result is a functionally equivalent Virtual Scenario Network (VSN), but with slightly different content. One use case for this would be an instructor teaching multiple sections of the same class and wishing to have the same environment, but different content for each section. This would prevent students exposed to the challenge before other sections from being able to pass along the passwords or flags that they had discovered. Furthermore, an instructor could conduct a challenge early on in the course and then again at the end of the course, varying the content so that it is not immediately obvious to students that they are participating in the same challenge as before.

Reports
With the addition of dynamic content from generators and variables, there needed to be a mechanism to collect the generated information when the Admin VM was done building the VSN. To accomplish this I introduced the report XML tag to the scenario XML file specification. This tag allows the Contributor to specify any text they wish and reference any OCCP variable in the scenario XML file. The OCCP VM Builder stores all the reports in a predictable location and makes them available to Content Packs during setup. A report could be as simple as a list of variables and their values, or detailed instructions that can contain the dynamic content from variables. A Content Pack can transfer the report to a VSN VM, which would be useful to distribute instructions to player VMs.

The Process of Creating a New Challenge
The process of creating an OCCP challenge with the OCCP VM Builder is illustrated in Figure 5. First, the Contributor picks or creates the Base VMs that their VSN will require. Their choice of Base VM determines what operating systems will be present in the final VSN. Next, the Contributor chooses or creates Content Packs that will configure the Base VMs to produce the final VMs for the VSN. Then, the Contributor writes a scenario XML file that describes their VSN. They can use variables and generators to make this description more dynamic and configurable. The scenario XML file can pass these variables to the Content Packs, which will cause the Admin VM to configure their VMs automatically.
Finally, the Contributor uses the OCCP VM Builder to build their VSN and run their challenge. When they are satisfied that the VSN is ready for others to use, the Contributor uses the OCCP VM Builder to package up all the required files to distribute their challenge.

Export Mode
Once a challenge has been built and is ready to be shared with the OCCP community, the Contributor can make use of the export mode. Most hypervisors provide export functionality that will prepare a VM to be brought to another hypervisor. With OCCP VM Builder's introduction of the Configuration Phases and dynamic content from generators, additional consideration is needed before exporting the VMs.
Since the export mode can take significant time to complete, the OCCP VM Builder first checks for any issues that could cause the process to fail later on.
The following logic is also illustrated in Figure 6. First, the program ensures that all the VMs in the scenario XML file exist and that they are eligible for export. Some VMs do not make sense to export and are therefore ineligible.
For instance, the OCCP Router VM, which is a standard part of each OCCP installation, is not available for export because every other OCCP installation will already have all of the elements required to reproduce it without importing it.
Next, the program determines if the VM is capable of regeneration by the existence of a phase1 snapshot. If the VM is capable, the program will revert the VM to the phase1 snapshot and move on to the next VM. If the VM does not have a phase1 snapshot, the program will try to revert to the phase2 snapshot. After the OCCP VM Builder is finished checking each VM, if there were no errors it will begin the export process. If there were errors, it will report all the errors that it encountered to the user and stop. Reporting these errors at this point saves the user from having to discover them as they happen, which could be several minutes or hours into the process depending on their hypervisor and the size of the VMs.
The export process prefers phase1 snapshots because that is the only state from which the VM can be reconfigured. For most hypervisors, the export function cannot retain snapshots; it can only export one state of the VM.
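The check-everything-first approach can be sketched as a function that collects every problem before any export begins. The function name, the data shapes, and the choice to silently skip ineligible VMs (such as the OCCP Router VM) are assumptions for this illustration, not a description of the tool's internals.

```python
def precheck_export(scenario_vms, hypervisor_vms, ineligible=()):
    """Decide which snapshot each VM would be exported from.

    `hypervisor_vms` maps a VM name to the set of its snapshot names.
    Returns (plan, errors); the export should only start if `errors`
    is empty.
    """
    plan, errors = {}, []
    for name in scenario_vms:
        if name in ineligible:
            continue  # reproducible on any installation, so not exported
        snapshots = hypervisor_vms.get(name)
        if snapshots is None:
            errors.append(f"{name}: VM does not exist")
        elif "phase1" in snapshots:
            plan[name] = "phase1"  # preferred: can still be reconfigured
        elif "phase2" in snapshots:
            plan[name] = "phase2"  # hand-built VM, exported as-is
        else:
            errors.append(f"{name}: no phase1 or phase2 snapshot")
    return plan, errors


plan, errors = precheck_export(
    ["router", "web", "db", "mail"],
    {"router": set(), "web": {"phase1", "phase2"}, "db": {"phase2"}},
    ineligible={"router"},
)
print(plan)
print(errors)
```

Gathering the errors into a list, rather than stopping at the first one, lets the user fix every problem in a single pass before committing to a long export.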

SSL Certificates
Another hurdle to using Puppet was its use of SSL certificates. Communications between agents and the master are secured with SSL in Puppet, which allows for secure communication and trust that the other party is who they claim to be. This use of SSL certificates prevents a bad actor from sniffing traffic or impersonating a client to gain information about a Puppet catalog or manifest.
This system works well in environments where agents are added by IT staff and will remain under the master's control for their life cycle. However, OCCP VSN VMs are not being added by a human being, nor is their life cycle particularly long. Furthermore, they will be shared with another Puppet environment if the author chooses to share their challenge.

Agent & Master Communication
Typically, a new client will perform the following procedure when attempting to contact the master. First, the client will generate its own private key if it doesn't have one. Next, it will request a copy of the Certificate Authority file from the master if it doesn't already have it. Next, the client will attempt to retrieve its signed certificate from the master. If it doesn't get one, it will determine if it has already generated a Certificate Signing Request, and if not, it will do so. With default settings, the client will have to wait until a user completes the signing request on the master. In normal environments, the user would check the fingerprint of the client's certificate and ensure it matches the request received on the master before signing the request. Due to the nature of the OCCP setup environment, the manual signing of certificates is unfavorable. If the setup process required human intervention to function, it would increase the complexity for OCCP users.
Not only would they have to be involved in the setup process, they would have to know how Puppet's certificate system worked. To automate this process, the OCCP Puppet master is configured to automatically sign every Certificate Signing Request it receives. While this solves the certificate signing issue, other issues still remained.
Because of the SSL system, bringing a VM from one Puppet infrastructure to another with a different master would cause problems. One of the features of using SSL is that the agent and master can verify the other party is who they claim to be. If the agent is brought to a new environment with a new master, its certificates will not match. To prevent this and increase the ability to share OCCP material amongst the target audiences, the Admin VM needed to ensure that all certificate material was removed. Before the Admin VM program gets an agent to check in, it first revokes any certificate for that VM. Secondly, at the end of each configuration phase, it will remove certificate material on the agent. Unfortunately, this means that the agent will need to go through all the work of an initial check in, but it will never encounter a mismatched certificate. This allows Contributors to share their OCCP challenge with other OCCP users without issue.

Design a Virtual Network for a Cyber Challenge
Since one of the motivations behind my work was to provide tools and mechanisms for a community to build challenges and contribute back, there needed to be some initial content from which the community could start. Since network defense is a popular challenge type that can teach and assess valuable cyber security skills, it was a natural choice to provide as an example. However, before I created the network defense challenge, I first created a generic reference VSN that modeled a typical small business network.

The OCCP Reference Virtual Scenario Network
The reference VSN includes several VMs running services typical of small business networks. While its generic configuration is not suitable for all challenges, it gives a working example that can be modified to create different VSN topologies. Figure 7 shows the reference VSN's network topology. At its core are five VMs: a firewall, a database, a file server, a mail server, and a web server.

Network Defense Virtual Scenario Network
To create the Network Defense scenario I started with the creation of the Content Packs. Since the only Content Packs that existed were the ones that I wrote for the reference VSN, I started by copying them. This resulted in the same network topology as the reference VSN. Because there were many services used in the reference, there was a lot of flexibility for the types of elements to include in a network defense context. Basing my design loosely on some of the concepts taught in URI's cyber security courses, I decided to include a vulnerable web application. After searching Rapid7's exploit database [3], I discovered a vulnerable web application that was a good fit for exercising skills taught in URI's classes.
Based on those skill sets, I did not expect participants to patch or remove the application, but there were steps they could take to mitigate the threat by fixing file permissions and fixing intentionally poorly-written PHP on another page of the website. That page would connect to the database with root's credentials which is a poor security practice. Another vulnerability that I added was a weak SSH configuration on the publicly facing servers. Due to the nature of this weakness, there were several ways for the participant to mitigate the threat. For instance, they could modify the SSH configuration on each exposed VM, configure local firewall rules, or configure the firewall VM to mitigate the threat. Lastly, I intentionally configured the firewall to be too permissive with some rules for traffic between the DMZ and LAN. Instead of only allowing the essential traffic from the DMZ to the protected LAN, the firewall allowed some non-essential traffic through as well.
The idea is that the participant could revise the firewall rules to be more secure by further restricting the allowed traffic. Another security weakness that was already present in the reference VSN was the use of anonymous FTP on the file server. The only "protection" the file server has is its network location and that the computers that can access it are supposed to be trustworthy. This is a poor security model with many ways that a participant could mitigate it. For instance, they could try to prevent an attacker from penetrating the LAN by securing the firewall or they could implement an authentication system for accessing the file server. Ultimately, the goal is that the participant should recognize that FTP is potentially dangerous and be able to explain why another technology would be a better choice for the file server.
Overall, if the participant did nothing to defend their network, an attacker could start by exploiting the vulnerable web application. Through that exploit the attacker could discover root's credentials to the database in the poorly-written PHP script. With root's credentials they could dump all the databases, one of which has email account names and passwords. After some password cracking and due to inappropriate firewall rules, the attacker could pivot from the web server to the database server located in the LAN. From this network position, the attacker has access to all of the network segments.
To create this challenge I was able to reuse, without modification, six of the reference VSN's ten Content Packs. I was then able to reuse, with slight modification, the remaining Content Packs. I did have to create an additional Content Pack which added smaller configuration details like user accounts and other miscellaneous configurations. As more challenges are built in this way, there will be more Content Packs to choose from, which should ease the burden of composing new challenges.

Incorporating a caching proxy
Though not part of my original design, the need for a caching proxy to expedite software downloads became apparent during the development of the OCCP VM Builder. When developing and testing the Installation Phase components of the Content Packs, I was downloading the same packages each time. Instead of going out to the Internet to download these packages several times, a caching proxy can download them once and serve later requests for that content locally. So, I added the Squid Proxy Server [4] to the OCCP VM Builder.
As seen in Figure 2, the VMs in the setup network have all of their traffic routed through the OCCP VM Builder's Admin VM. The caching proxy examines each request and determines whether it already has the requested content. If it does, it serves the content itself. If it does not, the proxy fetches the content and caches it so that it can be served locally the next time it is requested. Time to complete was also used in the comparison.
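The hit-or-miss behavior described above can be illustrated with a minimal sketch. This is not how Squid is implemented; the CachingFetcher class and the fetch_from_origin callable are hypothetical names used only for illustration, and the sketch ignores expiry and revalidation, which a real proxy must handle.

```python
from typing import Callable, Dict


class CachingFetcher:
    """Toy cache-then-fetch logic; a stand-in for the idea, not for Squid."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin is a hypothetical stand-in for an Internet download.
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_requests = 0  # counts how often the origin was contacted

    def get(self, url: str) -> bytes:
        if url in self._cache:
            # Cache hit: serve the stored copy locally.
            return self._cache[url]
        # Cache miss: download once, store the result, then serve it.
        self.origin_requests += 1
        body = self._fetch_from_origin(url)
        self._cache[url] = body
        return body
```

Requesting the same package from several VMs would then contact the origin only once, which is the saving that motivated adding the proxy.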

Experiment 4: Can the username generator generate content properly?
This experiment showed that the username generator can produce content that functions correctly on VMs created by the OCCP VM Builder's Admin VM.
I wrote scenario files that used the username generator in the following ways:
• Specified a properly-formatted name CSV file and a name count of one. If functioning correctly, this would produce three variables, one with the first name, one with the last name, and one with the username from a random line in the CSV file.
• Specified a properly-formatted name CSV file and a name count of five. If functioning correctly, this would produce three arrays containing five elements, one with the first names, one with the last names, and one with the usernames. Since there were no duplicate lines in the CSV file, the outputs should not contain duplicates.
• Specified a properly-formatted name CSV file and a name count greater than the number of usernames in the CSV. If functioning correctly, this would produce an error.
• Specified a non-existent name CSV file and a name count of one. If functioning correctly, this would produce an error.
• Specified a non-existent name CSV file and a name count greater than one. If functioning correctly, this would produce an error.
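The behavior these cases expect can be sketched as follows. This is a minimal illustration under stated assumptions, not the generator's actual code: the function name generate_usernames is hypothetical, the CSV is assumed to hold one "first name, last name, username" row per line, and the sketch always returns lists, whereas the real generator produces plain variables when the count is one.

```python
import csv
import random
from typing import Dict, List


def generate_usernames(csv_path: str, count: int) -> Dict[str, List[str]]:
    """Pick `count` distinct rows of the form first,last,username."""
    # A missing file raises FileNotFoundError, which covers the
    # non-existent CSV cases regardless of the count requested.
    with open(csv_path, newline="") as handle:
        rows = [
            [field.strip() for field in row]
            for row in csv.reader(handle)
            if len(row) == 3
        ]
    if count < 1:
        raise ValueError("count must be at least 1")
    if count > len(rows):
        raise ValueError(
            f"requested {count} names but only {len(rows)} are available"
        )
    # random.sample draws without replacement, so a file with no
    # duplicate lines cannot produce duplicate output.
    chosen = random.sample(rows, count)
    return {
        "first": [row[0] for row in chosen],
        "last": [row[1] for row in chosen],
        "username": [row[2] for row in chosen],
    }
```

Sampling without replacement is what makes the "count greater than the number of usernames" case an error rather than a source of repeated names.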

Experiment 5: Can the password generator generate functional content for challenges?
This experiment showed that the password generator is not only capable of producing content, but also that the content produced is functional in VMs produced by the OCCP VM Builder's Admin VM. By providing different sets of parameters to the generator and then assigning its output to users, I tested that I could successfully log in with the expected password in each generated case.

Algorithm Tests
This series of tests ensured that the algorithms are producing the correct content for the encrypted version of the password. Each output was assigned to a different user. A successful login for each user determined that the content generated was consistent with the algorithm specified. I employed these variations:
• Set the password to "password" and set the algorithm to "MD5"
• Set the password to "password" and set the algorithm to "SHA256"
• Set the password to "password" and set the algorithm to "SHA512"

Additional Parameter Tests
This series of tests ensured that the generator can produce the plain text password correctly from the given parameters. Visual inspection confirmed that the results match the parameters provided. The count was set to 5, and the length set to 25 for each test. I varied the type parameter among:
• "ASCII" - If functioning correctly, this would produce an array of five passwords, each twenty-five characters long.
• "Alpha" - If functioning correctly, this would produce an array of five passwords, each twenty-five characters long and containing only alphabetic characters. Any other characters would indicate failure.
• "AlphaNumeric" - If functioning correctly, this would produce an array of five passwords, each twenty-five characters long and containing only alphabetic and numeric characters. Any other characters would indicate failure.

Password Pool File Correctness Tests
This series of tests ensured that the generator can produce content from a specified pool of plain text passwords correctly. To accomplish this, I performed the following:
• I used a valid password pool file, and specified a count of one. The expected output was a variable containing the plain text password chosen, and one containing its encrypted version.
• I used a valid password pool file, and specified a count of four. I expected an array containing the plain text passwords chosen, and another array containing the encrypted versions.
• I used a valid password pool file, and specified a count greater than the number of passwords available in the pool file. I expected an error for this test.
• I specified a non-existent password pool file, and a count of one. I expected an error from this test.
• I specified a non-existent password pool file, and a count greater than one. I also expected an error from this test.

Experiment 6: Can the SSH key generator generate functional content for challenges?
This experiment showed that the SSH key generator can produce content that is functional in VMs created by the OCCP VM Builder's Admin VM. I used the generator to produce the SSH keys and tested them in VMs configured to allow SSH key authentication. A successful login determined the success of this experiment. I produced a key pair without a password, and one with a password of "password".

Unfortunately, the export and import process can take a significant amount of time to accomplish. There are a number of different factors contributing to this, many of which will vary greatly depending on the OCCP configuration and hypervisors involved. Though the process can take a significant amount of time, the Contributor should only ever have to export once. Likewise, the end users of the challenge will only have to import once. Regeneration after the first deploy of a scenario is much faster because there are no VMs to import, which eliminates a large portion of the time required in the initial deployment.

Experiment 3: Can the extended Admin VM produce reusable VMs for the OCCP, and how effective is this new method?
This experiment was designed to show that the OCCP VM Builder's Admin VM is capable of creating reusable VMs by changing the original content. In Experiment 1 I created my scenario XML file to match the usernames and passwords that I had used in the manual setup. Here in Experiment 3, I instead used generators to create the usernames and passwords for the seven user accounts, postfix's database password, local root password, and MySQL root password. Though the content that was generated did not match the content I used in the manual reconfiguration, the VMs are functionally equivalent. I collected the same metrics from Experiment 1 and summarized them in Table 4.3.

Summary
The OCCP VM Builder's Admin VM was able to produce VMs whose content differed from their original configuration. Table 3 compares the effort required to reuse the VMs with the OCCP VM Builder versus the manual method. Though not shown in the table, there are additional benefits to using the automated method.
By using generators, the regeneration flag can produce functionally equivalent VMs with new content with just a single command. Moreover, there is no chance of mistyping or forgetting to change something on any of the VMs. This suggests that the OCCP VM Builder is an improvement over manually creating VMs in terms of reusability.

Experiment 4 Results: Can the username generator generate content properly?
The results of this experiment show that the username generator functions correctly. Before beginning the experiment, I created a CSV file with fifty lines of the form "first name, last name, username" with no duplicates. This is the file that I used for tests requiring a valid CSV file.
The following XML fragment caused the generator to produce the expected result of creating three variables for the first name, last name, and username.
<var name="test1" generator="username">
  <param name="names">validNames.csv</param>
  <param name="count">1</param>
</var>
The following XML fragment caused the generator to produce the expected result of creating three arrays for the first name, last name, and username, each containing 5 values.

</var>
Since there were only fifty usernames available in the CSV, requesting one hundred caused an expected error. As such, this test was successful.
The following XML fragment caused the generator to produce an error as expected. The CSV file specified did not exist and therefore could not be used to generate any usernames.

</var>
Since the admin program was unable to locate the CSV file it produced an expected error. As such, this test was successful.
The following XML fragment caused the generator to produce an error as expected. The CSV file specified still did not exist, and the count parameter should have had no effect on the production of an error message.

</var>
Since the admin program was unable to locate the CSV file it produced an expected error regardless of the count parameter provided. As such, this test was successful.

Summary
Because each of the tests was successful, overall the experiment was a success.
The experiment set out to show that the username generator could generate content properly, and it has done so. This generator has been shown to function as intended and is suitable for use by Basic Contributors in their challenges.

Experiment 5 Results: Can the password generator generate functional content for challenges?
Due to the multiple functions of this generator, this experiment's tests were broken down into different categories. Each category tested a smaller unit; taken together, these units ensured the correctness of the overall generator.

Algorithm Tests
Before beginning these tests, I wrote a Content Pack that would configure one VM to set the generated passwords as the passwords of users on it. The following XML fragment was used in conjunction with the Content Pack.
<var name="md5" generator="password">
  <param name="password">password</param>
  <param name="algorithm">MD5</param>
</var>
<var name="sha256" generator="password">
  <param name="password">password</param>
  <param name="algorithm">SHA256</param>
</var>
<var name="sha512" generator="password">
  <param name="password">password</param>
  <param name="algorithm">SHA512</param>
</var>
Because the actual values can be quite long, I've only included the first fifteen characters. Visual inspection shows that each of the encrypted versions used the correct algorithm. The number between the first two dollar signs indicates which encryption algorithm was used: $1$ signifies MD5, $5$ signifies SHA256, and $6$ signifies SHA512. The next set of characters between dollar signs is the salt used to calculate the hash; the hash itself follows the last dollar sign. Both the hash and salt could change if this generator were run again with the same XML, but the algorithm number should not change. The first part of this test showed that the expected structure was generated; the second part showed that the output is functional when used on real VMs.
The Content Pack used the outputs to set the passwords for users called "md5", "sha256", and "sha512". I was able to log in as each of these users on an Ubuntu 14.04 server VM using the plain text password "password" as expected.
Since the content produced by this generator matched the expected form and could be used as a password on a real VM, these tests were successful.
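The visual inspection described above can also be expressed programmatically. The following is a minimal sketch, not part of the OCCP VM Builder; the names parse_crypt_hash and CryptHash are hypothetical. It splits a string of the form $id$salt$hash and maps the identifier to the algorithm names used in these tests. It also tolerates the optional "rounds=N" field that the SHA-based formats may carry, which the tests above did not exercise.

```python
from typing import NamedTuple

# Identifier-to-algorithm mapping as described in the text.
CRYPT_IDS = {"1": "MD5", "5": "SHA256", "6": "SHA512"}


class CryptHash(NamedTuple):
    algorithm: str
    salt: str
    digest: str


def parse_crypt_hash(value: str) -> CryptHash:
    """Split a "$id$salt$hash" string into its three parts."""
    parts = value.split("$")
    # A leading "$" yields an empty first element: ["", id, salt, hash].
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not in $id$salt$hash form")
    ident, rest = parts[1], parts[2:]
    if ident not in CRYPT_IDS:
        raise ValueError(f"unrecognized algorithm id: {ident}")
    # SHA-based formats may place an optional "rounds=N" field before the salt.
    if len(rest) == 3 and rest[0].startswith("rounds="):
        rest = rest[1:]
    if len(rest) != 2:
        raise ValueError("unexpected number of fields")
    return CryptHash(CRYPT_IDS[ident], rest[0], rest[1])
```

A check of this kind confirms only the structure of the value; the login test remains necessary to show that the hash actually corresponds to the password.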

Additional Parameter Tests
The following XML fragment was used for these tests.
<var name="ascii" generator="password">
  <param name="count">5</param>
  <param name="length">25</param>
  <param name="type">ASCII</param>
</var>
<var name="alpha" generator="password">
  <param name="count">5</param>
  <param name="length">25</param>
  <param name="type">Alpha</param>
</var>
<var name="alphanum" generator="password">
  <param name="count">5</param>
  <param name="length">25</param>
  <param name="type">AlphaNumeric</param>
</var>
As Table 7 shows, the expected arrays were generated and were filled with expected content. This experiment demonstrates that the generator is capable of responding to the type parameter. The output in Table 7 does not show any errors and is indeed random in nature. It is possible, however, that no invalid characters happened to be selected during these tests. With that caveat, this test was successful.
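Because visual inspection of random output can miss an occasional invalid character, the same check can be automated and repeated over many runs. The sketch below is illustrative only and makes assumptions the thesis does not state: the function names are hypothetical, and "ASCII" is assumed to mean letters, digits, and punctuation.

```python
import secrets
import string
from typing import List

# Assumed alphabets for each value of the "type" parameter.
ALPHABETS = {
    "ASCII": string.ascii_letters + string.digits + string.punctuation,
    "Alpha": string.ascii_letters,
    "AlphaNumeric": string.ascii_letters + string.digits,
}


def generate_passwords(kind: str, count: int, length: int) -> List[str]:
    """Return `count` random passwords of `length` characters each."""
    alphabet = ALPHABETS[kind]  # raises KeyError for an unknown type
    return [
        "".join(secrets.choice(alphabet) for _ in range(length))
        for _ in range(count)
    ]


def conforms(password: str, kind: str) -> bool:
    """Check that every character belongs to the alphabet for `kind`."""
    return all(ch in ALPHABETS[kind] for ch in password)
```

Running conforms over the output of a large number of generations would reduce the chance that an invalid character goes unnoticed simply because it was never drawn.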

Password Pool File Correctness Tests
Before beginning this test, I created a password pool file that contained ninety passwords each on their own line of the file. The passwords were referenced from Mark Burnett's top ten thousand most commonly used password list [1]. This is the file I used for each test requiring a valid password pool file.
The following XML fragment was used for the first test which should have generated one password randomly chosen from the password pool file.

</var>
As expected, the generator produced one password, "cheese", from the password pool file. As such this test was successful.
The following XML was used for the second test which should have generated an array of four passwords randomly chosen from the password pool file.
<var name="test2" generator="password">
  <param name="count">4</param>
  <param name="pool">passwordPool.txt</param>
</var>
As expected, the generator produced an array of four passwords from the password pool file. Table 8 shows the content that was produced by the generator. Since this matches the expected output, this test was successful.

Summary
Considering that each of the tests in this experiment was successful, the experiment overall was successful. This generator is capable of producing passwords that function on real VMs according to the parameters used. This experiment demonstrates that the generator behaves as intended for various inputs and can be used by Basic Contributors for their challenges.

Experiment 6 Results: Can the SSH key generator generate functional content for challenges?
Before beginning this experiment I wrote a Content Pack that would take the keys generated and configure two VMs to make use of them. If the generator performed correctly, running the VMs would show proper functionality.
The following XML produced the public and private parts of an SSH key pair. No password was provided to the generator, so the private key is not encrypted.
<var name="plainssh" generator="sshkey">
  <param name="count">1</param>
</var>
I was able to log in to the second VM via SSH by explicitly selecting the generated private key. This demonstrated that the public and private parts of the key pair were generated correctly and functioned as intended on real VMs. As such, this test was considered a success.
The following XML fragment produced the public and private parts of an SSH key pair. Because a password was provided, the private key was encrypted.
<var name="passwordssh" generator="sshkey">
  <param name="count">1</param>
  <param name="password">password</param>
</var>
Again, I was able to log in to the second VM via SSH by explicitly selecting the generated private key. Since the private key was encrypted, I had to provide the password to the key before it could be used. Because I successfully logged in, the public and private parts of the key pair were shown to have been generated correctly and to function as intended on real VMs. As such, this test was also a success.
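For readers who want to reproduce a comparable check by hand, the two cases can be approximated with the standard OpenSSH tools. The sketch below only builds the command lines; it does not describe how the generator itself creates keys, which the text does not specify. The function names, the RSA key type, and the 4096-bit size are assumptions made for illustration.

```python
from typing import List


def keygen_command(key_path: str, passphrase: str = "") -> List[str]:
    """Build an ssh-keygen invocation; an empty passphrase leaves the key unencrypted."""
    return [
        "ssh-keygen",
        "-q",              # suppress progress output
        "-t", "rsa",       # key type (assumed; the source does not name one)
        "-b", "4096",      # key size in bits (also assumed)
        "-N", passphrase,  # passphrase protecting the new private key
        "-f", key_path,    # private key file; the public key gets a .pub suffix
    ]


def login_command(key_path: str, user: str, host: str) -> List[str]:
    """Build an ssh invocation that explicitly selects the private key."""
    return ["ssh", "-i", key_path, f"{user}@{host}"]
```

Either list can be passed to subprocess.run on a machine with OpenSSH installed. With a non-empty passphrase, ssh prompts for it before the key can be used, which matches the behavior observed in the second test.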

Summary
Since each test in this experiment was successful, the experiment overall was successful. The experiment demonstrated that the SSH key generator was able to produce SSH key pairs that functioned as intended on VMs. Contributors can therefore use this generator for their challenges.

Conclusions
This section discusses how the goals of this thesis were met by summarizing the work performed and the experimentation results.

Goal 3: Avoid the introduction of a financial cost
This goal was an underlying goal of the OCCP itself. The motivation for this goal was to minimize the barriers to entry to the platform, thereby making wider adoption easier. If the work of this thesis had introduced a financial cost, it would have compromised the value of this work and of the OCCP in general. The technologies used in this thesis are all available free of charge. Though some of the technologies used have paid editions, those editions were not needed or used. As such, this goal has been met: none of the technologies used impose a financial cost.

Goal 4 Conclusions
Goal 4: The extended Admin VM should provide a distribution mechanism with reasonable expectations of ability for the target audiences
The export logic expressed in Figure 6 is used by the export mode to produce files required by another Admin VM instance to reproduce the VSN. Experiment 2 tested that the export and import process produced functionally equivalent VSNs.
The Advanced and Basic Contributors simply use the export mode to bundle the required files they would need to distribute to share their creation. The Basic Administrators then can use the deploy or launch mode to import the foreign VSN. Since the Basic Administrator does not have to intervene in this process or manipulate any files, this is reasonable for them to operate. If they were incapable of this, then they were incapable of using the OCCP before the OCCP VM Builder work was introduced. Likewise, the Advanced and Basic Contributors do not have to do any additional work other than running export mode. The OCCP VM Builder's Admin VM does all the decision making for them. Their technical abilities would allow them to change the mode argument to the program to export.
The alternative would be for them to follow the same logic that the Admin VM now uses and then to manually revert and export the correct VMs. As such, the export mode logic makes it easier for Contributors to prepare their work for distribution.
Even with the work of this thesis, creating challenges can still be a time-consuming and difficult task. Instead of creating the VMs manually, someone still must produce the content packs required to configure the machines. In some instances content packs can be reused. In other circumstances the content packs must be created from scratch. Though it can be time-consuming to create content packs, the content packs provide the benefit of reuse. Trying to reconfigure a VM that was configured manually can be tedious and error prone with limited return on investment. With careful creation of content packs and the use of variables and generators, the effort invested can be reused. Furthermore, the work can be shared with other users of the OCCP. The ability to share and reuse work, despite the effort required to make the original version, is valuable to the cyber security education community.

Although operating system diversity was a consideration when choosing the configuration management system, free Linux distributions were the focus of my tests. This was largely due to licensing restrictions imposed by Microsoft and Apple for their operating systems. Puppet was chosen because it is compatible with all the major operating systems, so nothing obvious prevents licensed operating systems like Windows or Mac OS X from being used with the extended Admin VM in future work.

Future Work
This section discusses possible future work that could improve the work of this thesis.

Licensed Operating Systems
As previously stated, licensed operating systems such as Windows and Mac OS X were considered, but largely untested. The major hurdle to using Mac OS X with this work is that its license agreement disallows installing it on non-Apple hardware, which limits the hypervisors that can be used. Though Microsoft Windows does not have such installation restrictions, its activation requirements are stricter. Another hurdle that would need to be worked out is remotely controlling Puppet. Remote command execution is not as easy with Windows as it is with Unix-based operating systems. Finally, these systems cannot be shared due to their licensing. Base VMs or scenarios using these operating systems could not be distributed as is. Instead, instructions for creating these bases would need to be provided. Future work could determine the best practices for overcoming these challenges.

Automated Installations
Base VMs were created to have a common starting point and simplify some aspects of this work. By cloning a machine with the operating system already installed, the user saves the time it takes to install the operating system each time they make use of that Base VM. The drawback is that the user is limited to the available Base VMs or must make their own. It may be possible with future work for the Admin VM to automatically create Base VMs from installation media or ISOs. Some operating system vendors provide systems that can supply answers to the questions the installer would ask during installation. With this mechanism it could be possible to automatically produce Base VMs. Since these systems vary by distribution, attempting to include automated installation was beyond the scope of this thesis project; it would simply be a convenience that spares the user from creating a Base VM by hand.

Additional Generators
More generators would provide challenge Contributors with more tools to work with and more variation during regeneration. A network generator would be useful for making challenges initially appear different despite having the same network topology. Users should be able to request networks capable of supporting at least the number of machines required. Users could also specify whether the network should be publicly routable or one of the networks reserved for internal use. The generator would generate the appropriate elements such as the subnet mask, gateway address, and broadcast address in addition to an array of all the valid IP addresses for the network. Creating this generator is complicated by the various reserved networks and sanity checking requirements, but it is still feasible with additional time.
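As a rough indication of feasibility, the core of such a network generator can be sketched with Python's standard ipaddress module. This is a hypothetical illustration, not a design for the OCCP: the function names are invented, only IPv4 is handled, the gateway is assumed to be the first usable address, and the sketch takes the first sufficiently large subnet of a given base network rather than choosing one at random or checking every reserved range.

```python
import ipaddress
from typing import Dict


def prefix_for_hosts(hosts_needed: int) -> int:
    """Longest IPv4 prefix whose subnet still holds `hosts_needed` usable addresses."""
    if hosts_needed < 1:
        raise ValueError("at least one host is required")
    # Two addresses per subnet are reserved (network and broadcast), so
    # the subnet needs hosts_needed + 2 addresses in total.
    host_bits = (hosts_needed + 1).bit_length()
    return 32 - host_bits


def describe_network(base: str, hosts_needed: int) -> Dict[str, object]:
    """Carve the first subnet of sufficient size out of the `base` network."""
    parent = ipaddress.ip_network(base)
    new_prefix = prefix_for_hosts(hosts_needed)
    if new_prefix < parent.prefixlen:
        raise ValueError("base network is too small for the requested hosts")
    subnet = next(parent.subnets(new_prefix=new_prefix))
    hosts = [str(addr) for addr in subnet.hosts()]
    return {
        "network": str(subnet),
        "netmask": str(subnet.netmask),
        "gateway": hosts[0],  # assumption: gateway is the first usable address
        "broadcast": str(subnet.broadcast_address),
        "private": subnet.is_private,  # True for networks reserved for internal use
        "hosts": hosts,
    }
```

For example, describe_network("192.168.0.0/16", 50) selects a /26 subnet, which provides 62 usable addresses. A full generator would additionally randomize the subnet choice and apply the sanity checks mentioned above.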

Further Evaluation
In order to draw more solid conclusions on the effectiveness and utility of the OCCP VM Builder, more testing and evaluation should be performed. In addition to recruiting a larger group of testers, having the testers repeat the experiments with multiple challenge designs could be useful. Although different designs should produce varied results, since the work required to implement them is different, repeating the experiments could help validate the results. It would show that the OCCP VM Builder is applicable to different challenges and provide a larger sample. The challenge designs must be descriptive in order for testers to be able to produce the challenge.
The design should describe each VM involved with the following:
• What software should be installed.
• How the software is configured differently from the default settings.
• What user accounts should be present.
• What the account passwords should be set to, or a description of the set of acceptable passwords if they can be randomized.
• What network(s) the VM is connected to.
• What content needs to be transferred.
• What the Unix ownership and permissions should be for content.

Accomplishments
Prior to the work of this thesis project, there did not exist a tool for creating reusable virtual machines for cyber challenges. By extending the already free OCCP in a way that did not introduce a financial cost, this project has produced such a tool. The effort exerted by challenge creators can now be reused by themselves and others when they share their challenges with the OCCP user base. Variables and generators can make functionally equivalent, but slightly different, VSNs, which greatly benefits the education community. With greater adoption of the OCCP, the work of this thesis should ease the burdens of creating new challenges.