Creating an AI agent that can interact in real time and respond intelligently to user queries is a fascinating venture into the realm of artificial intelligence. Today, we’re exploring how to develop such an agent from scratch using Python, without relying on third-party agent frameworks. Here’s a comprehensive guide inspired by Hasan Aboul Hasan’s post on Learn with Hasan, which delves deep into the process of building AI agents.
What is an AI Agent?
An AI agent is a program capable of autonomous actions in an environment to meet specified objectives. Modern AI agents can integrate large language models (LLMs) such as OpenAI’s GPT models with external functions to enhance their decision-making capabilities. This allows them to perform tasks that require real-time data, which LLMs on their own cannot handle because they are restricted to their training data.
Setting Up Your Project
- Create and Activate a Virtual Environment: Begin by setting up a new Python environment to manage dependencies efficiently.
- Install the OpenAI Package: With the environment ready, install the OpenAI Python package, which will be our primary tool for interfacing with the AI.
- Set Up Project Files: Create essential Python files like actions.py, prompts.py, and main.py to organize your project structure.
Creating Basic AI Functionality
- Generate Text: Utilize the OpenAI API to generate text. This function will be the core of your AI agent, allowing it to process and respond to inputs.
- Define Actions: Implement functions in actions.py that the AI can invoke. For instance, a function that fetches the response time of a website shows how an AI can perform external tasks.
- Define the ReAct Prompt: This is crucial for guiding the AI’s decision-making process. It describes a loop in which the AI reasons about the input (Thought), requests an action (Action), pauses while that action is executed (PAUSE), and then uses the returned result (Action_Response) to respond.
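The loop described in the list above can be sketched as follows. This is a minimal illustration and not the code from the original post: the prompt wording, the `Action: name: argument` line format, the `Answer:` prefix, and the helper names (`run_agent`, `fake_llm`, `get_response_time`) are all assumptions made for the example. The model is replaced by a scripted stand-in so the sketch runs without an API key, and the action returns a fixed value instead of timing a real request.

```python
import re

# Hypothetical system prompt; the exact wording is an assumption.
SYSTEM_PROMPT = """You run in a loop of Thought, Action, PAUSE, Action_Response.
Use Thought to reason about the question.
Use Action to request one of the available actions, then return PAUSE.
Action_Response will contain the result of running the action.
When you know the answer, reply with a line starting with 'Answer:'.

Available actions:
get_response_time: takes a URL, returns the response time in seconds.
"""

# Matches lines such as "Action: get_response_time: example.com"
ACTION_RE = re.compile(r"^Action:\s*(\w+):\s*(.+)$", re.MULTILINE)


def get_response_time(url):
    """Stand-in action that returns a fixed value instead of timing a request."""
    return 0.5


def run_agent(llm, question, actions, max_turns=5):
    """Drive the Thought/Action/PAUSE/Action_Response loop until an answer appears."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
    for _ in range(max_turns):
        reply = llm(messages)
        messages.append({"role": "assistant", "content": reply})
        match = ACTION_RE.search(reply)
        if match is None:
            return reply  # no action requested, so treat the reply as final
        name, argument = match.group(1), match.group(2).strip()
        if name not in actions:
            result = f"unknown action {name!r}"
        else:
            result = actions[name](argument)
        # Feed the action result back to the model for the next turn.
        messages.append({"role": "user", "content": f"Action_Response: {result}"})
    return None  # gave up after max_turns


def fake_llm(messages):
    """Scripted stand-in for a model call (no network access)."""
    last = messages[-1]["content"]
    if last.startswith("Action_Response:"):
        seconds = last.split(":", 1)[1].strip()
        return f"Answer: example.com responded in {seconds} seconds."
    return ("Thought: I should measure the site.\n"
            "Action: get_response_time: example.com\n"
            "PAUSE")


answer = run_agent(fake_llm, "How fast is example.com?",
                   {"get_response_time": get_response_time})
print(answer)
```

In a real agent, `llm` would wrap a call to the OpenAI API and the `actions` dictionary would map names to the functions defined in actions.py. The `max_turns` limit is a design choice that keeps a confused model from looping forever.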
Testing and Iteration
- Simulate Conversations: Test your AI by simulating interactions. This helps ensure the AI behaves as expected.
- Refine the System: Based on test results, refine the system prompts and the logic to improve accuracy and responsiveness.
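One way to simulate conversations without spending API calls is to replay scripted model replies and inspect the resulting transcript. The sketch below is an assumption about how such a harness might look; `ScriptedModel` and `simulate` are hypothetical names, not part of any library.

```python
class ScriptedModel:
    """Replays canned replies in order and records every prompt it receives."""

    def __init__(self, replies):
        self._replies = list(replies)
        self.calls = []

    def __call__(self, messages):
        # Store a copy so later appends to the history do not alter the record.
        self.calls.append(list(messages))
        if not self._replies:
            raise RuntimeError("script exhausted: more replies requested than expected")
        return self._replies.pop(0)


def simulate(model, user_turns, system_prompt="You are a helpful agent."):
    """Feed each user turn to the model and return the full transcript."""
    history = [{"role": "system", "content": system_prompt}]
    for turn in user_turns:
        history.append({"role": "user", "content": turn})
        reply = model(history)
        history.append({"role": "assistant", "content": reply})
    return history


model = ScriptedModel(["Hello! How can I help?", "Answer: 0.5 seconds."])
transcript = simulate(model, ["Hi", "How fast is example.com?"])

for message in transcript:
    print(f"{message['role']}: {message['content']}")
```

Because the scripted model records each prompt, a test can also check what the agent sent, for example that the system prompt is always the first message.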
Advanced Application: SEO Auditor AI Agent
Building on basic setups, one can create more specialized agents such as an SEO Auditor. This agent can analyze websites for SEO performance and provide actionable insights, demonstrating the AI’s practical utility in real-world applications.
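As an illustration of the kind of action an SEO auditor agent might call, the sketch below inspects a page’s HTML using only the standard library. The specific checks (title, meta description, a single h1) and the names `SEOParser` and `audit_html` are assumptions chosen for the example; the original post may audit different signals.

```python
from html.parser import HTMLParser


class SEOParser(HTMLParser):
    """Collects the <title>, the meta description, and the number of <h1> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_html(html):
    """Return a list of human-readable SEO issues found in the HTML."""
    parser = SEOParser()
    parser.feed(html)
    issues = []
    if not parser.title.strip():
        issues.append("missing <title>")
    if not parser.meta_description:
        issues.append("missing meta description")
    if parser.h1_count != 1:
        issues.append(f"expected exactly one <h1>, found {parser.h1_count}")
    return issues


sample = "<html><head><title>Shop</title></head><body><h1>A</h1><h1>B</h1></body></html>"
print(audit_html(sample))
```

An agent would fetch the page first and pass the HTML to `audit_html`, then hand the list of issues back to the model as the action result.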
Real World Example From AizenCortex.com:
Creating an AI agent that monitors your Gmail for important emails and alerts you via SMS when human intervention is required combines practical AI applications with everyday convenience. This example differs from more complex AI systems, which might involve deeper integrations and more dynamic decision-making processes. Here’s a straightforward guide, including all the necessary code snippets, that even those new to programming or AI can follow:
Step 1: Setting Up Your Environment
To start, ensure Python is installed on your system. Create and activate a virtual environment to manage dependencies:
# Create a virtual environment
python -m venv venv
# Activate the virtual environment
# On Windows
venv\Scripts\activate
# On Unix or MacOS
source venv/bin/activate
Step 2: Install Required Packages
You’ll need several packages to interact with the Gmail API, send SMS messages, and process email content:
# code starting:
pip install google-api-python-client google-auth-oauthlib twilio openai python-dotenv
These packages allow you to:
- google-api-python-client and google-auth-oauthlib: Interact with Gmail.
- twilio: Send SMS messages.
- openai: Analyze email content using AI.
- python-dotenv: Load API keys and other secrets from a .env file.
Step 3: Set Up Gmail API
Enable the Gmail API through the Google Cloud Console and download your credentials.json file. This will allow your script to access your Gmail account. The sub-steps below are numbered separately from the main steps of this guide.
Step 1: Create a Project in Google Cloud Console
- Go to the Google Cloud Console.
- Click on “Select a project” at the top of the screen, then click “New Project” in the top-right corner of the popup.
- Enter a project name and select or create a billing account if required. Then, click “Create.”
Step 2: Enable the Gmail API
- Navigate to the Dashboard of your new project in the Google Cloud Console.
- Click on “Navigation Menu” > “APIs & Services” > “Library.”
- In the API Library, search for “Gmail API” and select it.
- Click “Enable” to activate the Gmail API for your project.
Step 3: Configure Consent Screen
- From the API & Services dashboard, go to “OAuth consent screen.”
- Select the user type for your application, typically “External” if it’s going to be accessed by users outside of your organization.
- Fill out the application name, user support email, and developer contact information, then click “Save and Continue.”
- You can skip the scopes section for now and continue, but be sure to add the email and profile scopes if they are not included automatically.
- In the “Test Users” section, add the email addresses that will use this application. This is necessary for testing before verification.
Step 4: Create Credentials
- Go to “Credentials” in the sidebar under APIs & Services.
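- Click “Create Credentials” at the top of the page and choose “OAuth client ID.”
- Select the application type and name your credentials. The authentication code in this guide uses InstalledAppFlow with a temporary local server on a random port (port=0), which suits the “Desktop app” type. A “Web application” client only accepts the exact redirect URIs you register.
- If you choose “Web application,” add the URI where users will be sent after authorization under “Authorized redirect URIs.” For local testing, you can use http://localhost:8080, but the redirect URI the script uses (including the port) must then match the registered URI exactly.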
- Click “Create” and take note of the client ID and client secret shown in the popup.
Step 5: Download the Credentials
After creating your OAuth 2.0 client IDs:
- In the Credentials section, you’ll see your newly created credentials listed. Click the download icon on the right to download the JSON file, and save it as credentials.json.
- Place this file in your project directory as referenced in the authentication code.
Step 6: Implement Authentication in Your Application
Use the downloaded credentials.json file to authenticate users via OAuth 2.0 in your application. This is essential for allowing your AI agent to access a user’s Gmail inbox securely.
# code starting:
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

def gmail_authenticate():
    SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
    creds = None
    # The file token.json stores the user's access and refresh tokens.
    try:
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    except (FileNotFoundError, ValueError):
        # No usable token yet: run the OAuth flow in the browser and save the result.
        flow = InstalledAppFlow.from_client_secrets_file('credentials.json', SCOPES)
        creds = flow.run_local_server(port=0)
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    return build('gmail', 'v1', credentials=creds)
This process will handle user authentication and allow your application to make API calls to Gmail on behalf of the user. Make sure your application complies with Google’s API usage policies, especially regarding user data.
Step 4: Access Gmail
Authenticate and authorize your application to interact with Gmail using the following script:
# code starting:
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

def gmail_authenticate():
    SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
    creds = None
    # Attempt to use existing credentials
    try:
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    except (FileNotFoundError, ValueError):
        # Authenticate and save new token
        flow = InstalledAppFlow.from_client_secrets_file('credentials.json', SCOPES)
        creds = flow.run_local_server(port=0)
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    return build('gmail', 'v1', credentials=creds)

service = gmail_authenticate()
Step 5: Check Emails for Human Intervention
Analyze incoming emails to determine if a human needs to respond. This uses OpenAI’s API to process the email subject:
# code starting:
from openai import OpenAI

# Client interface for openai>=1.0; it reads OPENAI_API_KEY from the environment.
# (openai.ChatCompletion was removed in version 1.0 of the package.)
openai_client = OpenAI()

def check_emails_needing_response(service):
    results = service.users().messages().list(userId='me', labelIds=['INBOX']).execute()
    messages = results.get('messages', [])
    for message in messages:
        msg = service.users().messages().get(userId='me', id=message['id'], format='metadata').execute()
        # Fall back to an empty subject when the header is missing
        subject = next((header['value'] for header in msg['payload']['headers'] if header['name'] == 'Subject'), '')
        response = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "system", "content": "Reply with only 'yes' if this email subject needs an urgent human response; otherwise reply 'no'."},
                      {"role": "user", "content": subject}]
        )
        decision = response.choices[0].message.content
        if 'yes' in decision.lower():
            return message['id']
    return None
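A substring test for 'yes' also matches words such as "yesterday", and header names are not guaranteed to use the exact capitalization "Subject". A slightly more defensive variant of those two steps might look like the sketch below. The helper names `get_header` and `needs_human` are hypothetical, and the sample message only mimics the `payload.headers` shape the Gmail API returns.

```python
def get_header(message, name, default=""):
    """Return a header value from a Gmail API message resource (case-insensitive)."""
    headers = message.get("payload", {}).get("headers", [])
    for header in headers:
        if header.get("name", "").lower() == name.lower():
            return header.get("value", default)
    return default


def needs_human(decision):
    """Interpret the model's reply; only a reply that starts with 'yes' counts."""
    return decision.strip().lower().startswith("yes")


message = {"payload": {"headers": [{"name": "subject", "value": "Server down"}]}}
print(get_header(message, "Subject"))
print(needs_human("Yes, this looks urgent."))
print(needs_human("No. Yesterday's newsletter."))
```

This works best when the system prompt asks the model to begin its reply with a plain yes or no.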
Step 6: Send SMS Alert
Set up Twilio to send an SMS alert if an email requires your attention. The Twilio sub-steps below are numbered separately from the main steps of this guide.
Step 1: Create a Twilio Account
- Sign Up: Go to the Twilio website and sign up for a free account.
- Verify Your Email and Phone Number: Twilio will ask you to verify your email address and a personal phone number before you can send messages.
Step 2: Get Twilio API Credentials
- Navigate to the Dashboard: Once your account setup is complete, go to the Twilio dashboard.
- Retrieve Your Account SID and Auth Token: On the dashboard, you will find your Account SID and Auth Token. These are your API credentials that you’ll use to authenticate API requests.
Step 3: Obtain a Twilio Phone Number
- Buy a Twilio Phone Number: From the dashboard, go to the ‘Phone Numbers’ section and click ‘Buy a Number’. You can choose a number that has SMS capabilities.
- Configure Your Number: Ensure that your new number is set up to send and receive SMS messages.
Step 4 and beyond:
Twilio may ask you to complete additional configuration before your phone number becomes active. If you create a Messaging Service, you can choose the “Drop the message” option for incoming messages during integration setup and leave the callback URL box empty.
# code starting:
from twilio.rest import Client

def send_sms(message_id):
    account_sid = 'your_account_sid'
    auth_token = 'your_auth_token'
    client = Client(account_sid, auth_token)
    message = client.messages.create(
        body=f'Email {message_id} requires your attention!',
        from_='+1234567890',
        to='+0987654321'
    )
    print(message.sid)
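Rather than hard-coding credentials and phone numbers, the settings can be read from the environment and checked before any SMS is sent. The sketch below is one possible approach: TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN are the variable names used in the complete main.py in this guide, while TWILIO_FROM_NUMBER, ALERT_TO_NUMBER, and the function `load_sms_settings` are hypothetical names introduced here. Twilio expects phone numbers in E.164 format, so the sketch validates that as well.

```python
import os
import re

# E.164: a plus sign followed by up to 15 digits, the first of which is not zero.
E164_RE = re.compile(r"^\+[1-9]\d{1,14}$")


def load_sms_settings(environ=os.environ):
    """Read Twilio settings from the environment and validate the phone numbers."""
    settings = {
        "account_sid": environ.get("TWILIO_ACCOUNT_SID", ""),
        "auth_token": environ.get("TWILIO_AUTH_TOKEN", ""),
        "from_number": environ.get("TWILIO_FROM_NUMBER", ""),  # hypothetical name
        "to_number": environ.get("ALERT_TO_NUMBER", ""),       # hypothetical name
    }
    missing = [key for key, value in settings.items() if not value]
    if missing:
        raise ValueError(f"missing settings: {', '.join(missing)}")
    for key in ("from_number", "to_number"):
        if not E164_RE.match(settings[key]):
            raise ValueError(f"{key} is not in E.164 format: {settings[key]!r}")
    return settings


# Demonstration with a plain dictionary standing in for the real environment.
example = {
    "TWILIO_ACCOUNT_SID": "ACxxxxxxxx",
    "TWILIO_AUTH_TOKEN": "secret",
    "TWILIO_FROM_NUMBER": "+15005550006",
    "ALERT_TO_NUMBER": "+14155550100",
}
print(load_sms_settings(example)["to_number"])
```

Note that a placeholder such as '+0987654321' would be rejected by this check, because an E.164 number cannot start with zero after the plus sign.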
Step 7: Putting It All Together
Combine the components into a complete workflow that checks your Gmail and sends an SMS alert when necessary:
# code starting:
if __name__ == '__main__':
    service = gmail_authenticate()
    message_id = check_emails_needing_response(service)
    if message_id:
        send_sms(message_id)
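The workflow above checks the inbox once and exits. To monitor continuously, the check can be wrapped in a polling loop that remembers which messages have already triggered an alert. The following is a sketch under assumptions: `monitor` and its parameters are hypothetical, and the check and alert functions are passed in so the loop can be tried with stand-ins.

```python
import time


def monitor(check, alert, interval_seconds=60, max_cycles=None, sleep=time.sleep):
    """Call check() repeatedly and alert at most once per message ID."""
    alerted = set()
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        message_id = check()
        if message_id and message_id not in alerted:
            alert(message_id)
            alerted.add(message_id)
        cycles += 1
        # Skip the final sleep when a cycle limit has been reached.
        if max_cycles is None or cycles < max_cycles:
            sleep(interval_seconds)
    return alerted


# Demonstration with stand-ins instead of Gmail and Twilio.
results = iter(["abc123", "abc123", None])
sent = []
monitor(lambda: next(results), sent.append, interval_seconds=0, max_cycles=3)
print(sent)
```

With the functions from this guide, the call would look like `monitor(lambda: check_emails_needing_response(service), send_sms)`. Leaving `max_cycles` as `None` keeps the loop running until the process is stopped.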
This setup provides a less complex, yet highly functional AI agent tailored for a specific task: monitoring emails and alerting via SMS. It’s ideal for individuals or businesses looking to automate responses to critical communications efficiently. Be sure to replace the placeholders with your actual Twilio credentials and phone numbers, and adjust the OpenAI prompt and email handling logic as needed for your use case.
Complete Code for main.py:
Make sure you have created a .env file and saved your environment variables in it, such as OPENAI_API_KEY and the Twilio keys (TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN). Each line has the form OPENAI_API_KEY="your key goes here". Use straight double quotes and leave no space between the equal sign and the opening quote.
# code starting:
import os
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from openai import OpenAI
from twilio.rest import Client
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv(".env")

# Ensure your OpenAI API key is set in your environment variables
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("No OpenAI API key provided. Please set the OPENAI_API_KEY environment variable.")
# Client interface for openai>=1.0 (openai.ChatCompletion was removed in 1.0)
openai_client = OpenAI(api_key=api_key)


def gmail_authenticate():
    SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
    creds = None
    try:
        # Reuse the saved token if one exists
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    except (FileNotFoundError, ValueError):
        # No usable token yet: run the OAuth flow and save the new token
        flow = InstalledAppFlow.from_client_secrets_file('credentials.json', SCOPES)
        creds = flow.run_local_server(port=0)
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    return build('gmail', 'v1', credentials=creds)


def check_emails_needing_response(service):
    results = service.users().messages().list(userId='me', labelIds=['INBOX']).execute()
    messages = results.get('messages', [])
    for message in messages:
        msg = service.users().messages().get(userId='me', id=message['id'], format='metadata').execute()
        # Fall back to an empty subject when the header is missing
        subject = next((header['value'] for header in msg['payload']['headers'] if header['name'] == 'Subject'), '')
        response = openai_client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "system", "content": "Reply with only 'yes' if this email subject needs an urgent human response; otherwise reply 'no'."},
                      {"role": "user", "content": subject}]
        )
        decision = response.choices[0].message.content
        if 'yes' in decision.lower():
            return message['id']
    return None


def send_sms(message_id):
    account_sid = os.getenv("TWILIO_ACCOUNT_SID")
    auth_token = os.getenv("TWILIO_AUTH_TOKEN")
    if not account_sid or not auth_token:
        raise ValueError("Twilio credentials are not set. Please provide TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN.")
    client = Client(account_sid, auth_token)
    message = client.messages.create(
        body=f'Email {message_id} requires your attention!',
        from_='+1234567890',  # This should be your Twilio number
        to='+0987654321'  # This should be the recipient's number
    )
    print(message.sid)


if __name__ == '__main__':
    service = gmail_authenticate()
    message_id = check_emails_needing_response(service)
    if message_id:
        send_sms(message_id)
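Before running main.py, it can help to check the .env file against the KEY="value" format described earlier, since curly quotes pasted from a web page are an easy mistake to miss. The sketch below is a deliberately strict checker written for this guide’s format only; `lint_env` is a hypothetical helper, and python-dotenv itself accepts a wider range of syntax.

```python
import re

# KEY="value" with no spaces around the equals sign, per the format described above.
ENV_LINE_RE = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*="[^"]*"$')


def lint_env(text):
    """Return (line_number, problem) pairs for lines that break the expected format."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if "\u201c" in stripped or "\u201d" in stripped:
            problems.append((number, "curly quotes; use straight double quotes"))
        elif not ENV_LINE_RE.match(stripped):
            problems.append((number, 'expected KEY="value" with no spaces around ='))
    return problems


sample = (
    'OPENAI_API_KEY="sk-test"\n'
    'TWILIO_AUTH_TOKEN = "abc"\n'
    'TWILIO_ACCOUNT_SID=\u201cAC123\u201d\n'
)
for number, problem in lint_env(sample):
    print(f"line {number}: {problem}")
```

To check a real file, read it with `open(".env", encoding="utf-8").read()` and pass the text to `lint_env`.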
Conclusion
This guide provides a foundation for building autonomous AI agents using Python. By understanding and implementing these concepts, one can create sophisticated systems capable of intelligent, real-time interaction and decision-making.
For a deeper dive into creating AI agents and to explore more complex examples, consider reviewing the detailed guide and resources available at Learn with Hasan. This will offer you both theoretical knowledge and practical coding insights to enhance your projects.

