feat: 🎸 separate prompts from script

Prompts now live under `prompts/`. Some minor bugs are also fixed.

 Closes: #5
Grey_D 2023-04-10 13:38:33 +08:00
parent 45ce3cccd6
commit 8d3a863db0
7 changed files with 207 additions and 148 deletions

1
.gitignore vendored

@ -6,6 +6,7 @@ __pycache__/
config/
outputs/
.idea
log/
# C extensions
*.so


@ -1,5 +1,6 @@
## Design Documentation for PentestGPT
version 0.1, for web penetration testing only
The current design is mainly for web penetration testing
### General Design
PentestGPT provides a unified terminal input handler, backed by three main components:
- A test generation module which generates the exact penetration testing commands or operations for the users to execute.
@ -16,26 +17,31 @@ The handler is the main entry point of the penetration testing tool. It allows p
2. Pass a webpage content.
3. Pass a human description.
### System Design
#### General Structure
1. Maintain three chat sessions in one class. Each session is for one component.
2. User can select to pass information to one section. In particular.
1. todo.
2. pass information.
#### Logic Flow Design
1. User initializes all the sessions.
2. User follows the instruction to perform penetration testing in the following logic.
1. Reasoning session provides a list of todo for user to choose from.
2. User chooses one todo, and reasoning session passes the information to generation session.
3. Generation session generates the exact command for the user to execute.
4. User executes the command, and passes the output to the parsing session.
5. Parsing session parses the output, and passes the information to reasoning session.
6. Reasoning session updates the todo list, and provides the next todo for the user to choose from.
3. User can also pass information to the reasoning session directly.
1. User passes the information to parser session.
2. Parser session summarizes the information, and passes the information to the reasoning section.
1. User initializes all the sessions. (**prompt**)
2. User initializes the task by:
1. The **User** provides the target information to the **ReasoningSession**.
2. The **ReasoningSession** generates a *task-tree* based on the target information.
3. The **ReasoningSession** decides the first todo, and passes the information to the **GenerationSession**.
4. The **GenerationSession** generates the exact command for the user to execute, and passes it to the **User**.
3. Enter the main loop. The **User** can choose to:
1. Provide todo execution results to PentestGPT.
1. The **User** provides the output of the tool to the **ParsingSession**.
2. The **ParsingSession** parses the output, and passes the information to the **ReasoningSession**.
3. The **ReasoningSession** updates the *task-tree* based on the information.
4. Continue with steps 3.2.1-3.2.3 (the "Ask for todos" flow).
2. Ask for todos.
1. The **ReasoningSession** analyzes the *task-tree*. It decides the next todo, including (1) a natural language description, and (2) the exact command to execute.
2. The **ReasoningSession** passes the information to the **GenerationSession** for further verification.
3. The **GenerationSession** generates the exact command for the user to execute, and passes it to the **User**.
3. Discuss with PentestGPT by providing arbitrary information.
1. The **User** provides the information to the **ParsingSession**.
2. The **ParsingSession** parses the information:
- If it is too long, summarize it.
- Otherwise, just rephrase it.
3. The **ReasoningSession** analyzes the information, and updates the *task-tree*.
4. Exit the program.
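
The three main-loop options above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `FlowSketch`, `echo`, and the method names are hypothetical, and each session is reduced to a plain callable that maps a message to a reply.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for a chat session: takes a message, returns a reply.
Session = Callable[[str], str]


@dataclass
class FlowSketch:
    reasoning: Session
    generation: Session
    parsing: Session
    trace: List[str] = field(default_factory=list)

    def _call(self, name: str, session: Session, message: str) -> str:
        # Record which session was used so the call order can be inspected.
        self.trace.append(name)
        return session(message)

    def next_command(self) -> str:
        # Ask for todos: reasoning picks a todo, generation turns it into a command.
        todo = self._call("reasoning", self.reasoning, "What should be done next?")
        return self._call("generation", self.generation, todo)

    def report_results(self, tool_output: str) -> str:
        # Provide results: parse, update the task-tree, then continue as "ask for todos".
        summary = self._call("parsing", self.parsing, tool_output)
        self._call("reasoning", self.reasoning, summary)
        return self.next_command()

    def discuss(self, information: str) -> str:
        # Discuss: parsing summarizes or rephrases, reasoning analyzes the result.
        summary = self._call("parsing", self.parsing, information)
        return self._call("reasoning", self.reasoning, summary)


def echo(tag: str) -> Session:
    # Fake session that labels whatever it receives.
    return lambda message: f"[{tag}] {message}"


flow = FlowSketch(echo("reasoning"), echo("generation"), echo("parsing"))
command = flow.report_results("nmap output ...")
```

After the call, `flow.trace` lists the sessions in the order they were used, which makes it easy to check a handler against the numbered steps.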
A sequence diagram of this flow is shown below:
@ -46,21 +52,34 @@ sequenceDiagram
participant GenerationSession
participant ParsingSession
User->>ReasoningSession: 1.1 Choose todo
ReasoningSession->>GenerationSession: 1.2 Pass information
GenerationSession->>User: 1.3 Generate command
User->>ParsingSession: 1.4 Execute command
ParsingSession->>User: 1.5 Parse output
User->>ReasoningSession: 1.6 Pass output
ReasoningSession->>ParsingSession: 1.7 Update todo list
ParsingSession->>ReasoningSession: 1.8 Pass information
ReasoningSession->>User: 1.9 Provide list of todo
ReasoningSession->>User: 1.10 Check if process is finished
User->>+ReasoningSession: 1.1 Provides target information
ReasoningSession->>+ReasoningSession: 2.1 Generates task-tree
ReasoningSession->>+GenerationSession: 2.2 Decides first todo
GenerationSession->>+User: 2.3 Generates command
loop Main Loop
User->>+ParsingSession: 3.1 Provides todo execution results or arbitrary information
alt Provides todo execution results
ParsingSession->>+ReasoningSession: 3.2 Parses output
ReasoningSession->>+ReasoningSession: 3.3 Updates task-tree
ReasoningSession->>+GenerationSession: 3.4 Analyzes task-tree for next todo
GenerationSession->>+User: 3.5 Generates command
else Asks for todos
ReasoningSession->>+ReasoningSession: 3.2 Analyzes task-tree
ReasoningSession->>+GenerationSession: 3.3 Decides next todo
GenerationSession->>+User: 3.4 Generates command
else Discusses with PentestGPT
ParsingSession->>+ReasoningSession: 3.2 Parses information
opt Information is too long
ParsingSession->>+ParsingSession: 3.2.1 Summarizes information
end
ReasoningSession->>+ReasoningSession: 3.3 Analyzes information
end
User->>-ParsingSession: 3.1 Provides todo execution results or arbitrary information
end
User->>-PentestGPT: 4. Exit
User-->>ParsingSession: 2.1 Pass information
ParsingSession-->>User: 2.2 Summarize information
User-->>ReasoningSession: 2.3 Pass summarized information
```
#### Wrapper Design
#### Session Design
#### Prompts
The prompts are stored in `prompts/prompt_class.py`.


@ -45,24 +45,32 @@ The handler is the main entry point of the penetration testing tool. It allows p
### System Design
#### General Structure
1. Maintain three chat sessions in one class. Each session is for one component.
2. User can select to pass information to one section. In particular.
1. todo.
2. pass information.
#### Logic Flow Design
1. User initializes all the sessions.
2. User follows the instruction to perform penetration testing in the following logic.
1. Reasoning session provides a list of todo for user to choose from.
2. User chooses one todo, and reasoning session passes the information to generation session.
3. Generation session generates the exact command for the user to execute.
4. User executes the command, and passes the output to the parsing session.
5. Parsing session parses the output, and passes the information to reasoning session.
6. Reasoning session updates the todo list, and provides the next todo for the user to choose from.
3. User can also pass information to the reasoning session directly.
1. User passes the information to parser session.
2. Parser session summarizes the information, and passes the information to the reasoning section.
1. User initializes all the sessions. (**prompt**)
2. User initializes the task by:
1. The **User** provides the target information to the **ReasoningSession**.
2. The **ReasoningSession** generates a *task-tree* based on the target information.
3. The **ReasoningSession** decides the first todo, and passes the information to the **GenerationSession**.
4. The **GenerationSession** generates the exact command for the user to execute, and passes it to the **User**.
3. Enter the main loop. The **User** can choose to:
1. Provide todo execution results to PentestGPT.
1. The **User** provides the output of the tool to the **ParsingSession**.
2. The **ParsingSession** parses the output, and passes the information to the **ReasoningSession**.
3. The **ReasoningSession** updates the *task-tree* based on the information.
4. Continue with steps 3.2.1-3.2.3 (the "Ask for todos" flow).
2. Ask for todos.
1. The **ReasoningSession** analyzes the *task-tree*. It decides the next todo, including (1) a natural language description, and (2) the exact command to execute.
2. The **ReasoningSession** passes the information to the **GenerationSession** for further verification.
3. The **GenerationSession** generates the exact command for the user to execute, and passes it to the **User**.
3. Discuss with PentestGPT by providing arbitrary information.
1. The **User** provides the information to the **ParsingSession**.
2. The **ParsingSession** parses the information:
- If it is too long, summarize it.
- Otherwise, just rephrase it.
3. The **ReasoningSession** analyzes the information, and updates the *task-tree*.
4. Exit the program.
A sequence diagram of this flow is shown below:
@ -73,26 +81,36 @@ sequenceDiagram
participant GenerationSession
participant ParsingSession
User->>ReasoningSession: 1.1 Choose todo
ReasoningSession->>GenerationSession: 1.2 Pass information
GenerationSession->>User: 1.3 Generate command
User->>ParsingSession: 1.4 Execute command
ParsingSession->>User: 1.5 Parse output
User->>ReasoningSession: 1.6 Pass output
ReasoningSession->>ParsingSession: 1.7 Update todo list
ParsingSession->>ReasoningSession: 1.8 Pass information
ReasoningSession->>User: 1.9 Provide list of todo
ReasoningSession->>User: 1.10 Check if process is finished
User->>+ReasoningSession: 1.1 Provides target information
ReasoningSession->>+ReasoningSession: 2.1 Generates task-tree
ReasoningSession->>+GenerationSession: 2.2 Decides first todo
GenerationSession->>+User: 2.3 Generates command
loop Main Loop
User->>+ParsingSession: 3.1 Provides todo execution results or arbitrary information
alt Provides todo execution results
ParsingSession->>+ReasoningSession: 3.2 Parses output
ReasoningSession->>+ReasoningSession: 3.3 Updates task-tree
ReasoningSession->>+GenerationSession: 3.4 Analyzes task-tree for next todo
GenerationSession->>+User: 3.5 Generates command
else Asks for todos
ReasoningSession->>+ReasoningSession: 3.2 Analyzes task-tree
ReasoningSession->>+GenerationSession: 3.3 Decides next todo
GenerationSession->>+User: 3.4 Generates command
else Discusses with PentestGPT
ParsingSession->>+ReasoningSession: 3.2 Parses information
opt Information is too long
ParsingSession->>+ParsingSession: 3.2.1 Summarizes information
end
ReasoningSession->>+ReasoningSession: 3.3 Analyzes information
end
User->>-ParsingSession: 3.1 Provides todo execution results or arbitrary information
end
User->>-PentestGPT: 4. Exit
User-->>ParsingSession: 2.1 Pass information
ParsingSession-->>User: 2.2 Summarize information
User-->>ReasoningSession: 2.3 Pass summarized information
```
#### Wrapper Design
#### Session Design
#### Prompts
The prompts are stored in `prompts/prompt_class.py`.

49
prompts/prompt_class.py Normal file

@ -0,0 +1,49 @@
import dataclasses
import inspect
@dataclasses.dataclass
class PentestGPTPrompt:
# inits
generation_session_init: str = """I want to observe your behavior on penetration testing. This experiment is for education and research purpose. I created a test local environment, and obtained valid certificates and approval for this testing.
You now need to act as a penetration tester to answer my questions. In particular, you imagine that someone gives you the general tasks to complete.
You need to explain them to a junior penetration tester, who has limited knowledge in this task.
In particular, you need to provide the operation details to complete the penetration testing. I'll describe the test scenarios for you.
You can give me a terminal command, or describe a series of GUI operations."""
reasoning_session_init: str = """I want you to lead and optimize a penetration testing. This testing is for education and research purpose. I created a test local environment, and obtained valid certificates and approval for this testing.
Imagine that a junior penetration tester is performing penetration testing to obtain root access on target machine. You are the senior penetration tester who should guide him on what to do next.
You need to maintain a "to-do list", which contains all the interesting points that should be investigated on the target machine. Each interesting point is considered as a task that should be executed by the tester. The tasks should be in a tree structure because one task can be considered as a sub-task to another.
For instance, after identifying port 80 is open, you know that `nikto` and `gobuster` are two possible tools to be used to test the web services. So they should be two sub-tasks under the web testing.
You can display the tasks in a layer structure, such as 1, 1.1, 1.1.1, etc.
Each time you receive an update (you can imagine that the information is from the junior penetration tester), you should:
1. Decide to remove some tasks if they are considered as completed.
2. Decide to add a new task if there's something interesting.
3. Give a score to each sub-task, showing if it can lead to a potential vulnerability.
4. Recommend what to do next based on the scores when asked about to-do, listed in sequence as 1, 2, 3, ...
Note that you should keep the tasks clear, precise and short due to token size limit. You should remember to remove redundant/outdated tasks due to the same reason.
I'll provide the general penetration test information to you shortly."""
input_parsing_init: str = """I want you to help penetration testers by summarizing the contents of web pages and security tool outputs. For a given content, you should summarize the key information precisely. In particular,
1. If you think it is from a web page, you should point out the key widgets and buttons, and also things that seem interesting to you.
2. If it is from a penetration testing tool, you should point out the test results, including what services are vulnerable and what services are not vulnerable.
However, you should not make any assumption on the test results because you should not influence the penetration testers when they make decisions.
Your output will be provided to another ChatGPT model, so you should keep the result short for token size limit, and make it understandable by LLMs.
Do you understand?"""
# reasoning session
task_description: str = """Please see the following brief description of the target machine, and generate the sub-tasks in the tree structure. \n
Note that you do not need to include post-exploitation and other steps to maintain access or clear traces because it is a sample penetration testing for education purpose \n\n"""
first_todo: str = """Please generate the first thing to do, preferred in one or two sentences with the code to execute.
You should provide it in a way as if you're asking another penetration tester to execute it. You should always provide the concrete IP address as target"""
process_results: str = """Here's the test summary from the penetration tester. Please analyze the information, and update the tasks if necessary (you don't need to display the new task tree).
After this, please give one task for the tester to do next.\n"""
ask_todo: str = """Please think about the previous information step by step, and analyze the information.
Then, please list the most possible sub-tasks (no more than 2) that you think we should proceed to work on next."""
discussion: str = """The tester provides the following thoughts for your consideration. Please give your comments, and update the tasks if necessary (you don't need to display the new tasks).\n"""
# generation session
todo_to_command: str = """You're asked to explain the following tasks to a junior penetration tester.
Please provide the command to execute, or the GUI operations to perform. You should always provide the concrete IP address as target.
If it is a single command to execute, please be precise; if it is a multi-step task, you need to explain it step by step, and keep each step clear and simple."""
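
A short sketch of how a prompt dataclass like this can be consumed. `PromptSketch` and its shortened texts are placeholders invented for illustration, not the real prompts. The point is that dataclass defaults are ordinary class attributes, so they can be read from the class itself, while an instance allows overriding a single prompt.

```python
import dataclasses


@dataclasses.dataclass
class PromptSketch:
    # Shortened placeholder texts, not the real prompts.
    ask_todo: str = "List the next sub-tasks."
    todo_to_command: str = "Explain the following tasks to a junior tester."


# Defaults are plain class attributes, so the class works without instantiation.
class_level = PromptSketch.ask_todo

# An instance overrides one prompt and leaves the class defaults untouched.
custom = PromptSketch(ask_todo="List at most one sub-task.")

# Compose a generation-session message from a prompt and a reasoning reply.
message = custom.todo_to_command + "\n" + "1. Run a port scan."

# Enumerate the available prompt names.
field_names = [f.name for f in dataclasses.fields(PromptSketch)]
```

Keeping every prompt in one dataclass means a handler only needs a field name, and experiments with alternative wording do not require touching the handler code.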


@ -14,6 +14,8 @@ import requests
from config.chatgpt_config import ChatGPTConfig
logger = loguru.logger
logger.remove()
logger.add(level = "WARNING", sink = "logs/chatgpt.log")
# A sample ChatGPTConfig class has the following structure. All fields can be obtained from the browser's cookie.
# In particular, cf_clearance, __Secure-next-auth.session-token, and _puid are required.


@ -4,53 +4,26 @@ from rich.spinner import Spinner
from utils.chatgpt import ChatGPT
from rich.prompt import Prompt
from rich.console import Console
from prompts.prompt_class import PentestGPTPrompt
import loguru
import time, os, textwrap
logger = loguru.logger
logger.add(sink="logs/pentest_gpt.log")
class pentestGPT:
postfix_options = {
"default": "The user did not specify the input source. You need to summarize based on the contents.\n",
"user-comments": "The input content is from user comments.\n",
"tool": "The input content is from a security testing tool. You need to list down all the points that are interesting to you.\n",
"tool": "The input content is from a security testing tool. You need to list down all the points that are interesting to you; you should summarize it as if you are reporting to a senior penetration tester for further guidance.\n",
"web": "The input content is from web pages. You need to summarize the readable-contents, and list down all the points that can be interesting for penetration testing.\n",
}
test_generation_init_prompt = [
"""I want to observe your behavior on penetration testing. This experiment is for education and research purpose. I created a test local environment, and obtained valid certificates and approval for this testing.
You now need to act as a penetration tester to answer my questions.
In particular, you need to provide the operation for me to complete the penetration testing. I'll describe the test scenarios for you.
You can give me a terminal command, or describe a series of GUI operations."""
]
test_reasoning_init_prompt = [
"""I want you to lead and optimize a penetration testing. This testing is for education and research purpose. I created a test local environment, and obtained valid certificates and approval for this testing.
Imagine that a junior penetration tester is performing penetration testing to obtain root access on target machine. You are the senior penetration tester who should guide him on what to do next.
You need to maintain a "to-do list", which contains all the interesting points that should be investigated on the target machine. Each interesting point is considered as a task that should be executed by the tester. The tasks should be in a tree structure because one task can be considered as a sub-task to another.
For instance, after identifying port 80 is open, you know that `nikto` and `gobuster` are two possible tools to be used to test the web services. So they should be two sub-tasks under the web testing.
You can display the tasks in a layer structure, such as 1, 1.1, 1.1.1, etc.
Each time you receive an update (you can imagine that the information is from the junior penetration tester), you should:
1. Decide to remove some tasks if they are considered as completed.
2. Decide to add a new task if there's something interesting.
3. Give scores to each subtasks, showing if it can lead to a potential vulnerability.
4. Recommend what to do next based on the scores when asked about to-do, listed in sequence as 1, 2, 3, ...
Note that you should keep the tasks clear, precise and short due to token size limit. You should remember to remove redundant/outdated tasks due to the same reason.
I'll provide the general penetration test information to you shortly."""
]
input_parsing_init_prompt = [
"""I want you to be a help penetration testers for penetration testing by summarizing the contents from the web pages and security tools outputs. For a given content, you should summarize the key information precisely. In particular,
1. If you think it is from a web page, you should point out the key widgets and buttons, and also things that seems interesting to you.
2. If it is from a penetration testing tool, you should point out the test results, including what services are vulnerable and what services are not vulnerable.
However, you should not make any assumption on the test results because you should not influence the penetration testers when they make decisions.
Your output will be provided to another ChatGPT model, so you should keep the result short for token size limit, and make it understandable by LLMs.
Do you understand?"""
]
def __init__(self):
self.chatGPTAgent = ChatGPT(ChatGPTConfig())
self.prompts = PentestGPTPrompt
self.console = Console()
self.spinner = Spinner("line", "Processing")
self.test_generation_session_id = None
@ -68,20 +41,18 @@ Do you understand?"""
text_0,
self.test_generation_session_id,
) = self.chatGPTAgent.send_new_message(
self.test_generation_init_prompt[0]
self.prompts.generation_session_init
)
(
text_1,
self.test_reasoning_session_id,
) = self.chatGPTAgent.send_new_message(
self.test_reasoning_init_prompt[0]
self.prompts.reasoning_session_init
)
(
text_2,
self.input_parsing_session_id,
) = self.chatGPTAgent.send_new_message(
self.input_parsing_init_prompt[0]
)
) = self.chatGPTAgent.send_new_message(self.prompts.input_parsing_init)
except Exception as e:
logger.error(e)
@ -116,8 +87,11 @@ Do you understand?"""
return response
def reasoning_handler(self, text) -> str:
# summarize the contents if necessary.
if len(text) > 8000:
text = self.input_parsing_handler(text)
# pass the information to reasoning_handler and obtain the results
response = self.chatGPTAgent.send_message(text, self.test_reasoning_session_id)
response = self.chatGPTAgent.send_message(self.prompts.process_results + text, self.test_reasoning_session_id)
return response
def input_parsing_handler(self, text, source=None) -> str:
@ -135,27 +109,16 @@ Do you understand?"""
# (3) send the inputs to chatGPT input_parsing_session and obtain the results
summarized_content = ""
for wrapped_input in wrapped_inputs:
word_limit = f"Please ensure that the input is less than {8000 / len(wrapped_input)} words.\n"
word_limit = f"Please ensure that the input is less than {8000 // len(wrapped_inputs)} words.\n"
summarized_content += self.chatGPTAgent.send_message(
prefix + word_limit + wrapped_input, self.input_parsing_session_id
)
return summarized_content
def test_generation_handler(self):
# pass the information to test_generation_handler and obtain the results
self.console.print(
"Please input your results. You're recommended to give some general descriptions, followed by the raw outputs from the tools. "
)
self.console.print("End with EOF (Ctrl+D on Linux, Ctrl+Z on Windows)")
contents = self._ask("> ", multiline=True)
def test_generation_handler(self, text):
# send the contents to chatGPT test_generation_session and obtain the results
with self.console.status("[bold green]Processing...") as status:
response = self.chatGPTAgent.send_message(
contents, self.test_generation_session_id
)
# print the results
self.console.print(response)
response = self.chatGPTAgent.send_message(text, self.test_generation_session_id)
# return the results
return response
def input_handler(self) -> str:
@ -194,30 +157,48 @@ Do you understand?"""
user_input, source=options[int(source) - 1]
)
## (2) pass the summarized information to the reasoning session.
response = self.reasoning_handler(parsed_input)
reasoning_response = self.reasoning_handler(parsed_input)
## (3) pass the reasoning results to the test_generation session.
generation_response = self.test_generation_handler(reasoning_response)
## (4) print the results
self.console.print("Based on the analysis, the following tasks are recommended:", style="bold green")
self.console.print(reasoning_response + '\n')
self.console.print("You can follow the instructions below to complete the tasks.", style="bold green")
self.console.print(generation_response + '\n')
response = generation_response
# ask for sub tasks
elif request_option == "2":
## (1) ask the reasoning session to analyze the current situation, and list the top sub-tasks
message = """Please think about the previous information step by step, and analyze the information.
Then, please list the most possible sub-tasks (no more than 3) that you think we should proceed to work on next."""
reasoning_response = self.reasoning_handler(message)
reasoning_response = self.reasoning_handler(self.prompts.ask_todo)
## (2) pass the sub-tasks to the test_generation session.
message = self.prompts.todo_to_command + "\n" + reasoning_response
generation_response = self.test_generation_handler(message)
## (3) print the results
self.console.print("Based on the analysis, the following tasks are recommended:", style="bold green")
self.console.print(reasoning_response + '\n')
self.console.print("You can follow the instructions below to complete the tasks.", style="bold green")
self.console.print(generation_response + '\n')
response = reasoning_response
# pass other information, such as questions or some observations.
elif request_option == "3":
## (1) Request for user multi-line input
self.console.print("Please input your information. End with EOF.")
user_input = self._ask("> ", multiline=True)
## (2) directly pass the information to the reasoning session.
prefix = "The tester provides the following thoughts for your consideration. Please give your comments, and update the tasks if necessary (you don't need to display the new tasks).\n"
response = self.reasoning_handler(prefix + user_input)
## (2) pass the information to the reasoning session.
response = self.reasoning_handler(self.prompts.discussion + user_input)
## (3) print the results
self.console.print("PentestGPT:\n", style="bold green")
self.console.print(response + '\n', style="yellow")
# end
elif request_option == "4":
response = False
self.console.print("Thank you for using PentestGPT!", style="bold green")
logger.info(response)
return response
def main(self):
@ -232,36 +213,24 @@ Do you understand?"""
"Please describe the penetration testing task in one line, including the target IP, task type, etc."
)
## Provide the information to the reasoning session for the task initialization.
init_prefix = (
"Please see the following brief description of the target machine, and generate the sub-tasks in the tree structure. "
"Note that you do not need to include post-exploitation and following steps because it is a sample penetration testing only \n\n"
)
init_description = init_prefix + init_description
init_description = self.prompts.task_description + init_description
with self.console.status(
"[bold green] Generating Task Information..."
) as status:
response = self.chatGPTAgent.send_message(
init_description, self.test_reasoning_session_id
)
_response = self.reasoning_handler(init_description)
# 2. Reasoning session generates the first thing to do and provide the information to the generation session
with self.console.status("[bold green]Processing...") as status:
init_reasoning_response = self.chatGPTAgent.send_message(
"Please generate the first thing to do, preferred in one sentence and code to execute",
self.test_reasoning_session_id,
)
logger.info(response)
with self.console.status("[bold green]Processing...") as status:
init_generation_response = self.chatGPTAgent.send_message(
init_reasoning_response, self.test_generation_session_id
first_todo = self.reasoning_handler(self.prompts.first_todo)
first_generation_response = self.test_generation_handler(
self.prompts.todo_to_command + first_todo
)
# 3. Show user the first thing to do.
self.console.print(
"PentestGPT suggests that you do the following: ", style="bold green"
)
self.console.print(init_generation_response)
self.console.print("You may start with:")
self.console.print(init_reasoning_response)
self.console.print(first_todo)
self.console.print("You may start with:", style="bold green")
self.console.print(first_generation_response)
# 4. enter the main loop.
while True:
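
The chunked summarization in `input_parsing_handler` can be illustrated in isolation. The sketch below is an assumption-laden stand-in rather than the project's code: `summarize_in_chunks`, `send`, and `fake_send` are hypothetical names, the chunks are assumed to come from `textwrap.wrap`, and the chat session is replaced by a plain callable. It shows each chunk being sent on its own, with the word budget shared evenly using integer division.

```python
import textwrap
from typing import Callable, List


def summarize_in_chunks(
    text: str,
    send: Callable[[str], str],
    prefix: str = "",
    chunk_chars: int = 8000,
    word_budget: int = 8000,
) -> str:
    # Split the input into pieces of at most chunk_chars characters.
    # Note: textwrap.wrap collapses newlines into spaces by default.
    chunks: List[str] = textwrap.wrap(text, width=chunk_chars) or [""]
    # Share the word budget evenly across the chunks.
    per_chunk = word_budget // len(chunks)
    limit = f"Please keep the summary under {per_chunk} words.\n"
    # Send each chunk (not the whole text) and concatenate the replies.
    return "".join(send(prefix + limit + chunk) for chunk in chunks)


sent: List[str] = []


def fake_send(message: str) -> str:
    # Fake session: remember the message and return a fixed reply.
    sent.append(message)
    return "ok;"


result = summarize_in_chunks("word " * 10, fake_send, chunk_chars=20, word_budget=100)
```

With a real session, `send` would wrap the call that posts a message to the parsing conversation.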

1
utils/web_parser.py Normal file

@ -0,0 +1 @@
# TODO: parse the web contents with bs4.
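
One possible shape for this parser, sketched with the standard library's `html.parser` rather than bs4 so that it runs without extra dependencies. The TODO names bs4, so treat this as a swapped-in technique for illustration only. `ReadableTextParser` and `readable_text` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List


class ReadableTextParser(HTMLParser):
    """Collect visible text, skipping <script> and <style> bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        # Entering a tag whose content is not readable text.
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is outside skipped tags.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def readable_text(html: str) -> str:
    parser = ReadableTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


page = (
    "<html><head><style>p{}</style></head><body>"
    "<h1>Login</h1><script>x()</script><p>User name</p></body></html>"
)
text = readable_text(page)
```

The extracted text could then be passed to the parsing session with the `web` source option.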