Skip to main content

Generative AI

NLP to SQL: Build Data Questioning and Answering Application

Artificial Intelligence Concept Cpu Quantum Computing

Introduction to NLP to SQL

With the availability of powerful large language models, we now can convert natural language into accurate SQL (NLP to SQL) with a single callout, enabling users to express their information needs naturally and efficiently.

In this blog, we will explore the creation of a Data Questioning and Answering web application.

Technology Stack

  • Python
  • Azure OpenAI
  • Sqlite3


Nlp To Sql

ER Diagram



While making a call to the Azure OpenAI model, we need to carefully craft our prompt. Precision is crucial, as an ambiguous prompt may result in unintended outcomes.

We have set up the following prompt, where we replace the user’s NLP query.

prompt = """
    Given the following SQL tables, your job is to write queries given a user’s request.
    CREATE TABLE Branches (
        City TEXT,
        Country TEXT,
        EmployeeNumber INTEGER,
        GM TEXT
        DepartmentID INTEGER PRIMARY KEY,
        Name TEXT
        Name TEXT,
        Position TEXT,
        Salary REAL,
        DepartmentID INTEGER,
        BranchID INTEGER,
        FOREIGN KEY (DepartmentID) REFERENCES Departments(DepartmentID),
        FOREIGN KEY (BranchID) REFERENCES Branches(BranchID)
    Desired Format: Return valid sql query strictly for above tables otherwise return undefined
    Translate the following natural language query into SQL: {}
    SQL Query:

Azure OpenAI

NLP to SQL is one of the advantages of the Azure OpenAI model we can leverage for our Data Questioning and Answering application. In the code below, we have used the text-davinci-003 model, which converts human language to SQL

# Using Azure OpenAI API
from openai import AzureOpenAI
api_key = config["AZURE_OPENAI_KEY"]
client = AzureOpenAI(
    api_key = api_key,
response = client.completions.create(


HTML for Nlp to Sql

NLP to SQL Examples

Below are some human language queries I tried to input through an HTML form. I validated their corresponding SQL queries and their actual responses against the database. There are situations where we receive an SQL query that results in an error when executed on the actual database. We handle this scenario by asking the user a different or more appropriate question

/* 1. List all employees with their corresponding department names. */
        T Employees.Name, Departments.Name 
        FROM Employees 
        LEFT JOIN Departments 
        ON Employees.DepartmentID = Departments.DepartmentID;

/* 2. Find the total number of employees in each department. */
        SELECT DepartmentID, COUNT(*) AS Total_Employees
        FROM Employees
        GROUP BY DepartmentID;
/* 3. Identify employees who are managers along with their department names. */
        SELECT e.Name, e.Position, d.Name
        FROM Employees e
        JOIN Departments d ON e.DepartmentID = d.DepartmentID
        WHERE e.Position = 'Manager'

/* 4. List all employees in the IT department with their salaries. */
        SELECT EmployeeID, Name, Position, Salary FROM Employees 
        WHERE DepartmentID = (SELECT DepartmentID FROM Departments WHERE Name = 'IT')

/* 5. Find the average salary for employees in each department. */
        SELECT DepartmentID, AVG(Salary) 
        FROM Employees 
        GROUP BY DepartmentID;

/* 6. Identify the department with the highest total salary expense. */
        SELECT Departments.Name, SUM(Employees.Salary) AS TotalExpense
        FROM Employees
        INNER JOIN Departments
        On Employees.DepartmentID = Departments.DepartmentID
        GROUP BY Departments.Name
        ORDER BY TotalExpense DESC
        LIMIT 1;

/* 7. List employees who do not belong to any department. */
        SELECT * FROM Employees WHERE DepartmentID IS NULL;

/* 8. Find the department with the least number of employees. */
        SELECT Departments.Name
        FROM Employees
        LEFT JOIN Departments
        ON Departments.DepartmentID = Employees.DepartmentID
        GROUP BY Departments.Name
        ORDER BY COUNT(Employees.EmployeeID) ASC
        LIMIT 1;

/* 9. List employees who belong to the Finance department and have a salary above 1000 */
        SELECT *
        FROM Employees 
        WHERE DepartmentID = (SELECT DepartmentID FROM Departments WHERE Name = 'Finance')
        AND Salary > 1000;

/* 10. What is the salary of employee ID 101?*/
        "SELECT Salary FROM Employees WHERE EmployeeID = 101;


Hallucination refer to instances where the model generates or predicts SQL queries that are not accurate or relevant to the user’s intent. This might result in incorrect or unintended database operations. I have tried all the human language queries above, but it didn’t generate invalid SQL


Users can manipulate data either through a user form or via a prompt. We should handle all scenarios where the end user can only retrieve data from the application. In the above prompt section, I have included the desired format to constrain the model to generate a valid query; otherwise, it returns ‘undefined’ to handle invalid user queries. Users can ask anything which is not relevant to a particular database.

@app.route("/qa", methods=["POST"])
def prompt_to_qa():
    nlp_query = request.form.get("query")
    sql_query = get_sql_query(nlp_query)
    if(sql_query.strip().lower() == "undefined"):
        output = "Not a valid question"
        result = str(execute_sql_query(sql_query))
        check_json = is_json(result)
            output = json.loads(result)
            output = result
    return { "nlp_query": nlp_query, "sql_query": sql_query, "result": output}


SQL is case-insensitive when it comes to keywords, such as SELECT, FROM, and WHERE. However, it is case-sensitive when it comes to identifiers such as table names, column names, and aliases. When converting natural language queries to SQL, developers need to be careful with the case sensitivity of these identifiers.


You can modify the technology stack mentioned above to streamline the development of applications, regardless of UI, LLM, or database. Additionally, you can leverage libraries such as pandas. By experimenting with different combinations of the technology stack, we can enhance the capabilities of the application.

Important Links

Thoughts on “NLP to SQL: Build Data Questioning and Answering Application”

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Vishal Chaware

Vishal Chaware is Senior Technical Consultant at Perficient. He is a certified Marketing Cloud Developer with experience in Marketing Cloud API.

More from this Author

Follow Us