diff --git a/.gitignore b/.gitignore index 14b862d..e126ba5 100644 --- a/.gitignore +++ b/.gitignore @@ -6,6 +6,7 @@ __pycache__/ config/ outputs/ .idea +log/ # C extensions *.so diff --git a/PentestGPT_design.md b/PentestGPT_design.md index ba9e14f..6eb5844 100644 --- a/PentestGPT_design.md +++ b/PentestGPT_design.md @@ -1,5 +1,6 @@ ## Design Documentation for PentestGPT -version 0.1, for web penetration testing only +The current design is mainly for web penetration testing + ### General Design PentestGPT provides a unified terminal input handler, and backed by three main components: - A test generation module which generates the exact penetration testing commands or operations for the users to execute. @@ -16,26 +17,31 @@ The handler is the main entry point of the penetration testing tool. It allows p 2. Pass a webpage content. 3. Pass a human description. - -### System Design -#### General Structure -1. Maintain three chat sessions in one class. Each session is for one component. -2. User can select to pass information to one section. In particular. - 1. todo. - 2. pass information. - #### Logic Flow Design -1. User initializes all the sessions. -2. User follows the instruction to perform penetration testing in the following logic. - 1. Reasoning session provides a list of todo for user to choose from. - 2. User chooses one todo, and reasoning session passes the information to generation session. - 3. Generation session generates the exact command for the user to execute. - 4. User executes the command, and passes the output to the parsing session. - 5. Parsing session parses the output, and passes the information to reasoning session. - 6. Reasoning session updates the todo list, and provides the next todo for the user to choose from. -3. User can also pass information to the reasoning session directly. - 1. User passes the information to parser session. - 2. Parser session summarizes the information, and passes the information to the reasoning section. +1. 
User initializes all the sessions. (**prompt**) +2. User initializes the task by + 1. **User** provides the target information to the **ReasoningSession**. + 2. The **ReasoningSession** generates a *task-tree* based on the target information. + 3. The **ReasoningSession** decides the first todo, and passes the information to the **GenerationSession**. + 4. The **GenerationSession** generates the exact command for the user to execute, and passes it to the **User**. +3. Go into the main loop. The **User** can pick to: + 1. Provide todo execution results to PentestGPT. + 1. The **User** provides the output of the tool to the **ParsingSession**. + 2. The **ParsingSession** parses the output, and passes the information to the **ReasoningSession**. + 3. The **ReasoningSession** updates the *task-tree* based on the information. + 4. Continue with steps 3.2.1-3.2.3 to obtain the next todo. + 2. Ask for todos. + 1. The **ReasoningSession** analyzes the *task-tree*. It decides the next todo, including (1) a natural language description, and (2) the exact command to execute. + 2. The **ReasoningSession** passes the information to the **GenerationSession** for further verification. + 3. The **GenerationSession** generates the exact command for the user to execute, and passes it to the **User**. + 3. Discuss with PentestGPT by providing arbitrary information. + 1. The **User** provides the information to the **ParsingSession**. + 2. The **ParsingSession** parses the information: + - If it is too long, summarize it. + - Otherwise, just rephrase it. + 3. The **ReasoningSession** analyzes the information, and updates the *task-tree*. + + 4. Exit the program. 
A flow-chart is shown below: @@ -46,21 +52,34 @@ sequenceDiagram participant GenerationSession participant ParsingSession - User->>ReasoningSession: 1.1 Choose todo - ReasoningSession->>GenerationSession: 1.2 Pass information - GenerationSession->>User: 1.3 Generate command - User->>ParsingSession: 1.4 Execute command - ParsingSession->>User: 1.5 Parse output - User->>ReasoningSession: 1.6 Pass output - ReasoningSession->>ParsingSession: 1.7 Update todo list - ParsingSession->>ReasoningSession: 1.8 Pass information - ReasoningSession->>User: 1.9 Provide list of todo - ReasoningSession->>User: 1.10 Check if process is finished + User->>+ReasoningSession: 1.1 Provides target information + ReasoningSession->>+ReasoningSession: 2.1 Generates task-tree + ReasoningSession->>+GenerationSession: 2.2 Decides first todo + GenerationSession->>+User: 2.3 Generates command + loop Main Loop + User->>+ParsingSession: 3.1 Provides todo execution results or arbitrary information + alt Provides todo execution results + ParsingSession->>+ReasoningSession: 3.2 Parses output + ReasoningSession->>+ReasoningSession: 3.3 Updates task-tree + ReasoningSession->>+GenerationSession: 3.4 Analyzes task-tree for next todo + GenerationSession->>+User: 3.5 Generates command + else Asks for todos + ReasoningSession->>+ReasoningSession: 3.2 Analyzes task-tree + ReasoningSession->>+GenerationSession: 3.3 Decides next todo + GenerationSession->>+User: 3.4 Generates command + else Discusses with PentestGPT + ParsingSession->>+ReasoningSession: 3.2 Parses information + opt Information is too long + ParsingSession->>+ParsingSession: 3.2.1 Summarizes information + end + ReasoningSession->>+ReasoningSession: 3.3 Analyzes information + end + User->>-ParsingSession: 3.1 Provides todo execution results or arbitrary information + end + User->>-PentestGPT: 4. 
Exit - User-->>ParsingSession: 2.1 Pass information - ParsingSession-->>User: 2.2 Summarize information - User-->>ReasoningSession: 2.3 Pass summarized information ``` -#### Wrapper Design -#### Session Design \ No newline at end of file +#### Prompts +The prompts are stored in the `prompts/prompt_class.py`. + diff --git a/README.md b/README.md index bf8666f..77f868d 100644 --- a/README.md +++ b/README.md @@ -45,24 +45,32 @@ The handler is the main entry point of the penetration testing tool. It allows p ### System Design -#### General Structure -1. Maintain three chat sessions in one class. Each session is for one component. -2. User can select to pass information to one section. In particular. - 1. todo. - 2. pass information. #### Logic Flow Design -1. User initializes all the sessions. -2. User follows the instruction to perform penetration testing in the following logic. - 1. Reasoning session provides a list of todo for user to choose from. - 2. User chooses one todo, and reasoning session passes the information to generation session. - 3. Generation session generates the exact command for the user to execute. - 4. User executes the command, and passes the output to the parsing session. - 5. Parsing session parses the output, and passes the information to reasoning session. - 6. Reasoning session updates the todo list, and provides the next todo for the user to choose from. -3. User can also pass information to the reasoning session directly. - 1. User passes the information to parser session. - 2. Parser session summarizes the information, and passes the information to the reasoning section. +1. User initializes all the sessions. (**prompt**) +2. User initializes the task by + 1. **User** provides the target information to the **ReasoningSession**. + 2. The **ReasoningSession** generates a *task-tree* based on the target information. + 3. The **ReasoningSession** decides the first todo, and passes the information to the **GenerationSession**. + 4. 
The **GenerationSession** generates the exact command for the user to execute, and passes it to the **User**. +3. Go into the main loop. The **User** can pick to: + 1. Provide todo execution results to PentestGPT. + 1. The **User** provides the output of the tool to the **ParsingSession**. + 2. The **ParsingSession** parses the output, and passes the information to the **ReasoningSession**. + 3. The **ReasoningSession** updates the *task-tree* based on the information. + 4. Continue with steps 3.2.1-3.2.3 to obtain the next todo. + 2. Ask for todos. + 1. The **ReasoningSession** analyzes the *task-tree*. It decides the next todo, including (1) a natural language description, and (2) the exact command to execute. + 2. The **ReasoningSession** passes the information to the **GenerationSession** for further verification. + 3. The **GenerationSession** generates the exact command for the user to execute, and passes it to the **User**. + 3. Discuss with PentestGPT by providing arbitrary information. + 1. The **User** provides the information to the **ParsingSession**. + 2. The **ParsingSession** parses the information: + - If it is too long, summarize it. + - Otherwise, just rephrase it. + 3. The **ReasoningSession** analyzes the information, and updates the *task-tree*. + + 4. Exit the program. 
A flow-chart is shown below: @@ -73,26 +81,36 @@ sequenceDiagram participant GenerationSession participant ParsingSession - User->>ReasoningSession: 1.1 Choose todo - ReasoningSession->>GenerationSession: 1.2 Pass information - GenerationSession->>User: 1.3 Generate command - User->>ParsingSession: 1.4 Execute command - ParsingSession->>User: 1.5 Parse output - User->>ReasoningSession: 1.6 Pass output - ReasoningSession->>ParsingSession: 1.7 Update todo list - ParsingSession->>ReasoningSession: 1.8 Pass information - ReasoningSession->>User: 1.9 Provide list of todo - ReasoningSession->>User: 1.10 Check if process is finished + User->>+ReasoningSession: 1.1 Provides target information + ReasoningSession->>+ReasoningSession: 2.1 Generates task-tree + ReasoningSession->>+GenerationSession: 2.2 Decides first todo + GenerationSession->>+User: 2.3 Generates command + loop Main Loop + User->>+ParsingSession: 3.1 Provides todo execution results or arbitrary information + alt Provides todo execution results + ParsingSession->>+ReasoningSession: 3.2 Parses output + ReasoningSession->>+ReasoningSession: 3.3 Updates task-tree + ReasoningSession->>+GenerationSession: 3.4 Analyzes task-tree for next todo + GenerationSession->>+User: 3.5 Generates command + else Asks for todos + ReasoningSession->>+ReasoningSession: 3.2 Analyzes task-tree + ReasoningSession->>+GenerationSession: 3.3 Decides next todo + GenerationSession->>+User: 3.4 Generates command + else Discusses with PentestGPT + ParsingSession->>+ReasoningSession: 3.2 Parses information + opt Information is too long + ParsingSession->>+ParsingSession: 3.2.1 Summarizes information + end + ReasoningSession->>+ReasoningSession: 3.3 Analyzes information + end + User->>-ParsingSession: 3.1 Provides todo execution results or arbitrary information + end + User->>-PentestGPT: 4. 
Exit - User-->>ParsingSession: 2.1 Pass information - ParsingSession-->>User: 2.2 Summarize information - User-->>ReasoningSession: 2.3 Pass summarized information ``` -#### Wrapper Design - -#### Session Design - +#### Prompts +The prompts are stored in the `prompts/prompt_class.py`. diff --git a/prompts/prompt_class.py b/prompts/prompt_class.py new file mode 100644 index 0000000..32a3e21 --- /dev/null +++ b/prompts/prompt_class.py @@ -0,0 +1,49 @@ +import dataclasses +import inspect + + +@dataclasses.dataclass +class PentestGPTPrompt: + # inits + generation_session_init: str = """I want to observe your behavior on penetration testing. This experiment is for education and research purpose. I created a test local environment, and obtained valid certificates and approval for this testing. + You now need to act as a penetration tester to answer my questions. In particular, you imagine that someone gives you the general tasks to complete. + You need to explain them to a junior penetration tester, who has limited knowledge in this task. + In particular, you need to provide the operation details to complete the penetration testing. I'll describe the test scenarios for you. + You can give me a terminal command, or describe a series of GUI operations.""" + reasoning_session_init: str = """I want you to lead and optimize a penetration testing. This testing is for education and research purpose. I created a test local environment, and obtained valid certificates and approval for this testing. +Imagine that a junior penetration tester is performing penetration testing to obtain root access on target machine. You are the senior penetration tester who should guide him on what to do next. +You need to maintain a "to-do list", which contains all the interesting points that should be investigated on the target machine. Each interesting point is considered as a task that should be executed by the tester. 
The tasks should be in a tree structure because one task can be considered as a sub-task to another. +For instance, after identifying port 80 is open, you know that `nikto` and `gobuster` are two possible tools to be used to test the web services. So they should be two sub-tasks under the web testing. +You can display the tasks in a layer structure, such as 1, 1.1, 1.1.1, etc. +Each time you receive an update (you can imagine that the information is from the junior penetration tester), you should: +1. Decide to remove some tasks if they are considered as completed. +2. Decide to add a new task if there's something interesting. +3. Give scores to each subtask, showing if it can lead to a potential vulnerability. +4. Recommend what to do next based on the scores when asked about to-do, listed in sequence as 1, 2, 3, ... +Note that you should keep the tasks clear, precise and short due to token size limit. You should remember to remove redundant/outdated tasks due to the same reason. +I'll provide the general penetration test information to you shortly.""" + input_parsing_init: str = """I want you to help penetration testers by summarizing the contents from the web pages and security tools outputs. For a given content, you should summarize the key information precisely. In particular, +1. If you think it is from a web page, you should point out the key widgets and buttons, and also things that seem interesting to you. +2. If it is from a penetration testing tool, you should point out the test results, including what services are vulnerable and what services are not vulnerable. +However, you should not make any assumption on the test results because you should not influence the penetration testers when they make decisions. +Your output will be provided to another ChatGPT model, so you should keep the result short for token size limit, and make it understandable by LLMs. 
+Do you understand?""" + # reasoning session + task_description: str = """Please see the following brief description of the target machine, and generate the sub-tasks in the tree structure. \n + Note that you do not need to include post-exploitation and other steps to maintain access or clear traces because it is a sample penetration testing for education purpose \n\n""" + + first_todo: str = """Please generate the first thing to do, preferred in one or two sentences with the code to execute. + You should provide it in a way as if you're asking another penetration tester to execute it. You should always provide the concrete IP address as target""" + + process_results: str = """Here's the test summary from the penetration tester. Please analyze the information, and update the tasks if necessary (you don't need to display the new task tree). + After this, please give one task for the tester to do next.\n""" + + ask_todo: str = """Please think about the previous information step by step, and analyze the information. + Then, please list the most possible sub-tasks (no more than 2) that you think we should proceed to work on next.""" + + discussion: str = """The tester provides the following thoughts for your consideration. Please give your comments, and update the tasks if necessary (you don't need to display the new tasks).\n""" + + # generation session + todo_to_command: str = """You're asked to explain the following tasks to a junior penetration tester. + Please provide the command to execute, or the GUI operations to perform. You should always provide the concrete IP address as target. 
+ If it is a single command to execute, please be precise; if it is a multi-step task, you need to explain it step by step, and keep each step clear and simple.""" diff --git a/utils/chatgpt.py b/utils/chatgpt.py index a070759..74d6e2d 100644 --- a/utils/chatgpt.py +++ b/utils/chatgpt.py @@ -14,6 +14,8 @@ import requests from config.chatgpt_config import ChatGPTConfig logger = loguru.logger +logger.remove() +logger.add(level = "WARNING", sink = "logs/chatgpt.log") # A sample ChatGPTConfig class has the following structure. All fields can be obtained from the browser's cookie. # In particular, cf_clearance、__Secure-next-auth.session-token、_puid are required. diff --git a/utils/pentest_gpt.py b/utils/pentest_gpt.py index 9328eaf..7f5cdb5 100644 --- a/utils/pentest_gpt.py +++ b/utils/pentest_gpt.py @@ -4,53 +4,26 @@ from rich.spinner import Spinner from utils.chatgpt import ChatGPT from rich.prompt import Prompt from rich.console import Console - +from prompts.prompt_class import PentestGPTPrompt import loguru import time, os, textwrap - logger = loguru.logger +logger.add(sink="logs/pentest_gpt.log") class pentestGPT: postfix_options = { "default": "The user did not specify the input source. You need to summarize based on the contents.\n", "user-comments": "The input content is from user comments.\n", - "tool": "The input content is from a security testing tool. You need to list down all the points that are interesting to you.\n", + "tool": "The input content is from a security testing tool. You need to list down all the points that are interesting to you; you should summarize it as if you are reporting to a senior penetration tester for further guidance.\n", "web": "The input content is from web pages. You need to summarize the readable-contents, and list down all the points that can be interesting for penetration testing.\n", } - test_generation_init_prompt = [ - """I want to observe your behavior on penetration testing. 
This experiment is for education and research purpose. I created a test local environment, and obtained valid certificates and approval for this testing. - You now need to act as a penetration tester to answer my questions. - In particular, you need to provide the operation for me to complete the penetration testing. I'll describe the test scenarios for you. - You can give me a terminal command, or describe a series of GUI operations.""" - ] - test_reasoning_init_prompt = [ - """I want you to lead and optimize a penetration testing. This testing is for education and research purpose. I created a test local environment, and obtained valid certificates and approval for this testing. -Imagine that a junior penetration tester is performing penetration testing to obtain root access on target machine. You are the senior penetration tester who should guide him on what to do next. -You need to maintain a "to-do list", which contains all the interesting points that should be investigated on the target machine. Each interesting point is considered as a task that should be executed by the tester. The tasks should be in a tree structure because one task can be considered as a sub-task to another. -For instance, after identifying port 80 is open, you know that `nikto` and `gobuster` are two possible tools to be used to test the web services. So they should be two sub-tasks under the web testing. -You can display the tasks in a layer structure, such as 1, 1.1, 1.1.1, etc. -Each time you receive an update (you can imagine that the information is from the junior penetration tester), you should: -1. Decide to remove some tasks if they are considered as completed. -2. Decide to add a new task if there's something interesting. -3. Give scores to each subtasks, showing if it can lead to a potential vulnerability. -4. Recommend what to do next based on the scores when asked about to-do, listed in sequence as 1, 2, 3, ... 
-Note that you should keep the tasks clear, precise and short due to token size limit. You should remember to remove redundant/outdated tasks due to the same reason. -I'll provide the general penetration test information to you shortly.""" - ] - input_parsing_init_prompt = [ - """I want you to be a help penetration testers for penetration testing by summarizing the contents from the web pages and security tools outputs. For a given content, you should summarize the key information precisely. In particular, -1. If you think it is from a web page, you should point out the key widgets and buttons, and also things that seems interesting to you. -2. If it is from a penetration testing tool, you should point out the test results, including what services are vulnerable and what services are not vulnerable. -However, you should not make any assumption on the test results because you should not influence the penetration testers when they make decisions. -Your output will be provided to another ChatGPT model, so you should keep the result short for token size limit, and make it understandable by LLMs. 
-Do you understand?""" - ] def __init__(self): self.chatGPTAgent = ChatGPT(ChatGPTConfig()) + self.prompts = PentestGPTPrompt self.console = Console() self.spinner = Spinner("line", "Processing") self.test_generation_session_id = None @@ -68,20 +41,18 @@ Do you understand?""" text_0, self.test_generation_session_id, ) = self.chatGPTAgent.send_new_message( - self.test_generation_init_prompt[0] + self.prompts.generation_session_init ) ( text_1, self.test_reasoning_session_id, ) = self.chatGPTAgent.send_new_message( - self.test_reasoning_init_prompt[0] + self.prompts.reasoning_session_init ) ( text_2, self.input_parsing_session_id, - ) = self.chatGPTAgent.send_new_message( - self.input_parsing_init_prompt[0] - ) + ) = self.chatGPTAgent.send_new_message(self.prompts.input_parsing_init) except Exception as e: logger.error(e) @@ -116,8 +87,11 @@ Do you understand?""" return response def reasoning_handler(self, text) -> str: + # summarize the contents if necessary. + if len(text) > 8000: + text = self.input_parsing_handler(text) # pass the information to reasoning_handler and obtain the results - response = self.chatGPTAgent.send_message(text, self.test_reasoning_session_id) + response = self.chatGPTAgent.send_message(self.prompts.process_results + text, self.test_reasoning_session_id) return response def input_parsing_handler(self, text, source=None) -> str: @@ -135,27 +109,16 @@ Do you understand?""" # (3) send the inputs to chatGPT input_parsing_session and obtain the results summarized_content = "" for wrapped_input in wrapped_inputs: - word_limit = f"Please ensure that the input is less than {8000 / len(wrapped_input)} words.\n" + word_limit = f"Please ensure that the input is less than {8000 / len(wrapped_inputs)} words.\n" summarized_content += self.chatGPTAgent.send_message( prefix + word_limit + text, self.input_parsing_session_id ) return summarized_content - def test_generation_handler(self): - # pass the information to test_generaiton_handler and obtain the 
results - self.console.print( - "Please input your results. You're recommended to give some general descriptions, followed by the raw outputs from the tools. " - ) - self.console.print("End with EOF (Ctrl+D on Linux, Ctrl+Z on Windows)") - contents = self._ask("> ", multiline=True) + def test_generation_handler(self, text): # send the contents to chatGPT test_generation_session and obtain the results - with self.console.status("[bold green]Processing...") as status: - response = self.chatGPTAgent.send_message( - contents, self.test_generation_session_id - ) - # print the results - self.console.print(response) - + response = self.chatGPTAgent.send_message(text, self.test_generation_session_id) + # print the results return response def input_handler(self) -> str: @@ -194,30 +157,48 @@ Do you understand?""" user_input, source=options[int(source) - 1] ) ## (2) pass the summarized information to the reasoning session. - response = self.reasoning_handler(parsed_input) + reasoning_response = self.reasoning_handler(parsed_input) + ## (3) pass the reasoning results to the test_generation session. + generation_response = self.test_generation_handler(reasoning_response) + ## (4) print the results + self.console.print("Based on the analysis, the following tasks are recommended:", style="bold green") + self.console.print(reasoning_response + '\n') + self.console.print("You can follow the instructions below to complete the tasks.", style="bold green") + self.console.print(generation_response + '\n') + response = generation_response + # ask for sub tasks elif request_option == "2": ## (1) ask the reasoning session to analyze the current situation, and list the top sub-tasks - message = """Please think about the previous information step by step, and analyze the information. 
- Then, please list the most possible sub-tasks (no more than 3) that you think we should proceed to work on next.""" - reasoning_response = self.reasoning_handler(message) + reasoning_response = self.reasoning_handler(self.prompts.ask_todo) + ## (2) pass the sub-tasks to the test_generation session. + message = self.prompts.todo_to_command + "\n" + reasoning_response + generation_response = self.test_generation_handler(message) + ## (3) print the results + self.console.print("Based on the analysis, the following tasks are recommended:", style="bold green") + self.console.print(reasoning_response + '\n') + self.console.print("You can follow the instructions below to complete the tasks.", style="bold green") + self.console.print(generation_response + '\n') response = reasoning_response + # pass other information, such as questions or some observations. elif request_option == "3": ## (1) Request for user multi-line input self.console.print("Please input your information. End with EOF.") user_input = self._ask("> ", multiline=True) - ## (2) directly pass the information to the reasoning session. - prefix = "The tester provides the following thoughts for your consideration. Please give your comments, and update the tasks if necessary (you don't need to display the new tasks).\n" - response = self.reasoning_handler(prefix + user_input) + ## (2) pass the information to the reasoning session. + response = self.reasoning_handler(self.prompts.discussion + user_input) + ## (3) print the results + self.console.print("PentestGPT:\n", style="bold green") + self.console.print(response + '\n', style="yellow") # end elif request_option == "4": response = False + self.console.print("Thank you for using PentestGPT!", style="bold green") - logger.info(response) return response def main(self): @@ -232,36 +213,24 @@ Do you understand?""" "Please describe the penetration testing task in one line, including the target IP, task type, etc." 
) ## Provide the information to the reasoning session for the task initialization. - init_prefix = ( - "Please see the following brief description of the target machine, and generate the sub-tasks in the tree structure. " - "Note that you do not need to include post-exploitation and following steps because it is a sample penetration testing only \n\n" - ) - init_description = init_prefix + init_description + init_description = self.prompts.task_description + init_description with self.console.status( "[bold green] Generating Task Information..." ) as status: - response = self.chatGPTAgent.send_message( - init_description, self.test_reasoning_session_id - ) - + _response = self.reasoning_handler(init_description) # 2. Reasoning session generates the first thing to do and provide the information to the generation session with self.console.status("[bold green]Processing...") as status: - init_reasoning_response = self.chatGPTAgent.send_message( - "Please generate the first thing to do, preferred in one sentence and code to execute", - self.test_reasoning_session_id, - ) - logger.info(response) - with self.console.status("[bold green]Processing...") as status: - init_generation_response = self.chatGPTAgent.send_message( - init_reasoning_response, self.test_generation_session_id + first_todo = self.reasoning_handler(self.prompts.first_todo) + first_generation_response = self.test_generation_handler( + self.prompts.todo_to_command + first_todo ) # 3. Show user the first thing to do. self.console.print( "PentestGPT suggests you to do the following: ", style="bold green" ) - self.console.print(init_generation_response) - self.console.print("You may start with:") - self.console.print(init_reasoning_response) + self.console.print(first_todo) + self.console.print("You may start with:", style="bold green") + self.console.print(first_generation_response) # 4. enter the main loop. 
while True: diff --git a/utils/web_parser.py b/utils/web_parser.py new file mode 100644 index 0000000..e02d637 --- /dev/null +++ b/utils/web_parser.py @@ -0,0 +1 @@ +# TODO: parse the web contents with bs4. \ No newline at end of file
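
Reviewer sketch (not part of the patch): the "provide todo execution results" path in the logic flow (parse, then reason, then generate) can be illustrated roughly as below. This is a minimal sketch under stated assumptions, not the project's implementation. `SendFn`, `summarize_in_chunks`, `handle_tool_output`, and the plain-string session ids are hypothetical names standing in for `ChatGPT.send_message` and the three session ids. The sketch also assumes the intent of the `word_limit` hunk is a per-chunk integer budget with each request carrying one chunk rather than the full text; the diff as written uses float division and sends `text`.

```python
import textwrap
from typing import Callable, List

# Hypothetical stand-in for ChatGPT.send_message(text, session_id); not the real API.
SendFn = Callable[[str, str], str]

CHAR_LIMIT = 8000  # same threshold the diff uses in reasoning_handler

# Opening sentence of the process_results prompt from prompts/prompt_class.py.
PROCESS_RESULTS = "Here's the test summary from the penetration tester.\n"


def summarize_in_chunks(send: SendFn, text: str, session_id: str,
                        prefix: str = "", width: int = CHAR_LIMIT) -> str:
    """Split long input and summarize each chunk under a shared word budget."""
    chunks: List[str] = textwrap.wrap(text, width=width) or [""]
    # Integer budget per chunk, so the prompt never contains a long float.
    words_per_chunk = CHAR_LIMIT // len(chunks)
    limit = f"Please ensure that the summary is less than {words_per_chunk} words.\n"
    # Assumption: each request carries one chunk, not the whole text.
    return "".join(send(prefix + limit + chunk, session_id) for chunk in chunks)


def handle_tool_output(send: SendFn, text: str) -> str:
    """Parse the tool output, update the reasoning session, then ask for a command."""
    summary = summarize_in_chunks(send, text, "parsing")
    todo = send(PROCESS_RESULTS + summary, "reasoning")
    return send(todo, "generation")
```

Passing the send function as a parameter keeps the flow testable without a live ChatGPT session; in the patch itself the equivalent calls go through `self.chatGPTAgent.send_message` with the stored session ids.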