Co-Developing With An A.I. Assistant

Leverage A.I. to build faster by understanding its strengths


I've been asked recently by a few friends to explain the process of developing with an A.I. assistant, since they've seen mixed reviews of using this new technology from other programmers. So I thought I would put together a collection of real-world examples and high-level strategies based on what I've learned over the last few months.

A.I. can build your bricks, not your wall

A.I. is not yet great at one-shotting large projects or concepts, though this may change quickly. What this means is that if you prompt GPT with something like:

"Build me a react app that does x"

or even:

"Build me a react app that does X. It should do Y and Z. It should have login and logout.." and so on...

...it will often fail in these scenarios because its ability to auto-complete (remember, LLMs are just really advanced completion engines) gets weaker as the complexity grows. I will counter this by saying GPT is quite good at nearly one-shotting Python scripts, but you should never assume anything works perfectly out of the gate.

Your strategy should be getting it to build your Lego pieces. Lots of small, pure functions are where this technology shines. This makes sense if you think about it: small, focused, pure functions are easy to reason through, even for A.I.! You can achieve this by writing your prompt to walk the A.I. through your code in a more deliberate and explicit way.
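To make "Lego piece" concrete, here is the kind of function I mean: one input, one output, no side effects. This particular sketch is my own illustration (not GPT output), based on the timestamp conversion used later in this post:

def srt_timestamp_to_seconds(timestamp):
    # Convert an SRT timestamp like "00:01:23,456" into seconds (83.456)
    hh, mm, ss_ms = timestamp.split(':')
    ss, ms = ss_ms.split(',')
    return int(hh) * 3600 + int(mm) * 60 + int(ss) + int(ms) / 1000

print(srt_timestamp_to_seconds("00:01:23,456"))  # 83.456

A function this small is trivial for an A.I. to get right, easy for you to verify at a glance, and easy to compose into something bigger.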

Example 1: A Python script for searching transcribed audio files

I wanted an application to search a bunch of transcribed audio files that have SRTs and, if the search query is found, open the audio file at the timestamp of the selected result. I couldn't find anything online that did this, so I thought I'd see if GPT could help me write this up quickly.

I can technically code almost anything I think up, but the effort-to-reward ratio dissuades me most of the time. For instance, the time it would have taken to write this script manually would almost certainly have put me off, since it was just a "nice to have", but with GPT and Copilot, I can get it done in minutes, not days. I expect a lot more developers to build interesting and productive tooling because A.I. brings the workload down.

Here is my initial prompt:

Let's write a python application together. It is a CLI application that responds to user input in the terminal. This should be written assuming python 3.10 or higher. Here is the required functionality:

  1. User submits a string to search. Given a target folder, the application will recursively search all subfolders for any .srt files that contain the search string.
  2. Application will list any moments where search string is found in audio files where the name matches the .srt file
  3. User can choose a result and the application will open VLC player on user's computer and play the audio file at the moment in time associated with the line in the .srt file

GPT-4's initial code was a solid first pass:

import argparse
import os
import re
import subprocess
from pathlib import Path

def search_srt_files(target_folder, search_string):
    results = []
    for root, _, files in os.walk(target_folder):
        for file in files:
            if file.endswith('.srt'):
                srt_path = Path(root) / file
                with open(srt_path, encoding='utf-8') as srt_file:
                    content = srt_file.read()
                    if search_string.lower() in content.lower():
                        results.append(srt_path)
    return results

def extract_timestamps(srt_path, search_string):
    timestamps = []
    with open(srt_path, encoding='utf-8') as srt_file:
        for line in srt_file:
            if search_string.lower() in line.lower():
                prev_line = srt_file.tell() - len(line)
                srt_file.seek(prev_line)
                timestamp_line = srt_file.readline()
                start_time = re.match(r'\d{2}:\d{2}:\d{2},\d{3}', timestamp_line)
                if start_time:
                    timestamps.append(start_time.group())
    return timestamps

def open_vlc(audio_path, timestamp):
    vlc_path = "C:\\Program Files\\VideoLAN\\VLC\\vlc.exe"  # Update this path to the location of VLC on your computer
    hh, mm, ss_ms = timestamp.split(':')
    ss, ms = ss_ms.split(',')
    seconds = int(hh) * 3600 + int(mm) * 60 + int(ss) + int(ms) / 1000
    subprocess.Popen([vlc_path, audio_path, f"--start-time={seconds}"])

def main():
    parser = argparse.ArgumentParser(description="Search for a string in .srt files and play the associated audio file.")
    parser.add_argument("search_string", help="String to search in .srt files.")
    parser.add_argument("target_folder", help="Target folder to search for .srt files.")
    args = parser.parse_args()

    search_string = args.search_string
    target_folder = args.target_folder

    srt_files = search_srt_files(target_folder, search_string)
    if not srt_files:
        print("No .srt files found with the search string.")
        return

    for index, srt_path in enumerate(srt_files, start=1):
        print(f"{index}. {srt_path}")
        timestamps = extract_timestamps(srt_path, search_string)
        for timestamp in timestamps:
            print(f"  - {timestamp}")

    choice = int(input("\nEnter the index of the file you want to play: ")) - 1
    srt_path = srt_files[choice]
    audio_path = str(srt_path.with_suffix('.mp3'))  # Change this to the appropriate audio format (e.g., .mp3, .wav, .m4a, etc.)
    timestamp = extract_timestamps(srt_path, search_string)[0]  # Assumes the user wants to play the first occurrence
    open_vlc(audio_path, timestamp)

if __name__ == "__main__":
    main()

Great start, but a few issues:

  1. open_vlc assumes a Windows setup, and this will fail on my Mac.
  2. The script only lets you pick a file to play, not a specific matching timestamp; it just assumes you want the first occurrence.
  3. You can't quit out of the selection; you have to Control-C to close the script if you don't want to play any of the results (a quick fix is sketched below).
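That third issue is small enough to fix by hand, or to feed back to GPT as its own focused request. Here is a sketch of what such a fix might look like; the helper name choose_index and the 'q' convention are my own assumptions, not GPT's output:

def choose_index(prompt, count):
    # Ask the user for a 1-based index; return None if they type 'q' or anything invalid
    raw = input(prompt).strip()
    if raw.lower() == 'q':
        return None
    if not raw.isdigit():
        return None
    index = int(raw) - 1
    return index if 0 <= index < count else None

main() can then exit cleanly when it gets back None instead of relying on Control-C.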

Now that we have a first pass, we can make our requests more precise to surgically fix the script and adjust it to our needs.

Rewrite the open_vlc function to target macos. Respond only with the code for that function.

def open_vlc(audio_path, timestamp):
    vlc_path = "/Applications/VLC.app/Contents/MacOS/VLC"  # Path to the VLC executable on macOS
    hh, mm, ss_ms = timestamp.split(':')
    ss, ms = ss_ms.split(',')
    seconds = int(hh) * 3600 + int(mm) * 60 + int(ss) + int(ms) / 1000
    subprocess.Popen([vlc_path, audio_path, f"--start-time={seconds}"])
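If you bounce between machines, you could also ask for a platform-aware version instead of hard-coding one path. The sketch below is my own, not GPT's output; it reuses the two VLC paths already shown above, and the final branch assumes vlc is on your PATH (e.g., on Linux):

import subprocess
import sys

def open_vlc(audio_path, timestamp):
    # Choose a VLC executable based on the current platform
    if sys.platform == "darwin":
        vlc_path = "/Applications/VLC.app/Contents/MacOS/VLC"
    elif sys.platform.startswith("win"):
        vlc_path = "C:\\Program Files\\VideoLAN\\VLC\\vlc.exe"
    else:
        vlc_path = "vlc"  # assumes VLC is on the PATH
    hh, mm, ss_ms = timestamp.split(':')
    ss, ms = ss_ms.split(',')
    seconds = int(hh) * 3600 + int(mm) * 60 + int(ss) + int(ms) / 1000
    subprocess.Popen([vlc_path, audio_path, f"--start-time={seconds}"])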

When I tried to run the script, an error occurred:

"1. /Users/johndoe/workspace/recordings/20230414 1300 Recording/20230414 1300 Recording.srt Traceback (most recent call last): File "/Users/johndoe/workspace/audiosearcher/main.py", line 73, in main() File "/Users/johndoe/workspace/audiosearcher/main.py", line 60, in main timestamps = extracttimestamps(srtpath, searchstring) File "/Users/johndoe/workspace/audiosearcher/main.py", line 26, in extracttimestamps prevline = srt_file.tell() - len(line) OSError: telling position disabled by next() call "

Explain it

My apologies for the oversight. The error you encountered is because using the next() function or iterator inside a loop disables the tell() method, which is used to get the current position of the file pointer. Instead of using tell() and seek() to move the file pointer, we can change the approach to extract timestamps from the .srt file.

Here's an updated version of the extract_timestamps function to resolve this issue:

def extract_timestamps(srt_path, search_string):
    timestamps = []
    with open(srt_path, encoding='utf-8') as srt_file:
        lines = srt_file.readlines()
        for i, line in enumerate(lines):
            if search_string.lower() in line.lower():
                j = i - 1
                while j >= 0:
                    timestamp_line = lines[j]
                    start_time = re.match(r'\d{2}:\d{2}:\d{2},\d{3}', timestamp_line)
                    if start_time:
                        timestamps.append(start_time.group())
                        break
                    j -= 1
    return timestamps

This version reads all the lines in the file first and then iterates through them, eliminating the need for tell() and seek(). If the search string is found, it looks for the closest preceding timestamp in the .srt file.
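If you want to see the underlying Python behavior in isolation, a three-line repro is enough. The filename here is just a placeholder; any text file will do:

with open("example.srt", encoding="utf-8") as srt_file:
    for line in srt_file:
        srt_file.tell()  # raises OSError: telling position disabled by next() call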

You can see how GPT is able to reason and fix things as you narrow the focus of its task. Its initial pass on the whole script led to simple errors, but as you drill down into the functions and highlight issues, it's able to fix them with ease.

Later on, GPT modified a code block and it led to an error. I leveraged Chain of Thought to force it to walk through the process and fix it:

The problem is not resolved. This is the code that is problematic:

for file_index, srt_path in enumerate(srt_files, start=1):
    print(f"{file_index}. {srt_path}")
    all_timestamps = []
    for search_string in search_strings:
        timestamps = extract_timestamps(srt_path, search_string)
        all_timestamps.extend([(timestamp, search_string) for timestamp in timestamps])
    found_timestamps.append(all_timestamps)
    all_timestamps.sort()

    for timestamp_index, (timestamp, search_string) in enumerate(all_timestamps, start=1):
        occurrence_count = 0
        subtitle_line = ""
        with open(srt_path, encoding='utf-8') as srt_file:
            for line in srt_file:
                if search_string.lower() in line.lower():
                    occurrence_count += 1
                    if str(timestamp) in line:
                        subtitle_line = line.strip()
                        break
        truncated_subtitle = subtitle_line[:50] + "..." if len(subtitle_line) > 50 else subtitle_line
        print(f"  {file_index}.{timestamp_index} - {timestamp} - {truncated_subtitle}")

Let's walk through this step by step. We need this code to:

  1. Iterate over every SRT file that was found

  2. Within every found SRT file, iterate over every found timestamp

  3. Print out the line with a file/timestamp index, the timestamp and a truncated subtitle line

Apologies for the confusion earlier. Let's update the problematic code to meet your requirements:

  1. Iterate over every SRT file that was found.
  2. Within every found SRT file, iterate over every found timestamp.
  3. Print out the line with a file/timestamp index, the timestamp, and a truncated subtitle line.

Here's the corrected code:

def main():
    # ... (previous code)

    found_timestamps = []
    for file_index, srt_path in enumerate(srt_files, start=1):
        print(f"{file_index}. {srt_path}")
        all_timestamps = []
        for search_string in search_strings:
            timestamps = extract_timestamps(srt_path, search_string)
            all_timestamps.extend([(timestamp, search_string) for timestamp in timestamps])
        found_timestamps.append(all_timestamps)
        all_timestamps.sort()

        for timestamp_index, (timestamp, search_string) in enumerate(all_timestamps, start=1):
            subtitle_line = ""
            with open(srt_path, encoding='utf-8') as srt_file:
                lines = srt_file.readlines()
                for i, line in enumerate(lines):
                    if str(timestamp) in line and search_string.lower() in lines[i+1].lower():
                        subtitle_line = lines[i+1].strip()
                        break

            truncated_subtitle = subtitle_line[:50] + "..." if len(subtitle_line) > 50 else subtitle_line
            print(f"  {file_index}.{timestamp_index} - {timestamp} - {truncated_subtitle}")

    # ... (rest of the code)

This updated code should correctly display the file/timestamp index, timestamp, and truncated subtitle for each found result.

Working through the errors took about 10 minutes, at which point the script worked perfectly. Overall time spent was roughly 20 minutes with distractions. If I had attempted to code this myself, I estimate it would have taken me about one full day of focused coding and debugging.

Example 2: Improving a SQL query

Recently I needed a Postgres query to use the minimum value from a subquery. My SQL-fu is sub-par, and historically I would have Googled similar examples and adjusted my code to match, but with GPT I can just provide my code and have it rewritten in place.

Given this Postgresql query:

select r.name, (
  select sum(si.price_per_ounce * ri.ounces)
  from store_ingredient si
  join recipe_ingredient ri on ri.ingredient_id = si.ingredient_id
  where si.ingredient_id in (
    select ri.ingredient_id
    from recipe_ingredient ri
    where ri.recipe_id = r.id
  )
) as "lowest cost"
from recipe r;

How would I change this query so that only the minimum si.price_per_ounce is returned?

To modify the given PostgreSQL query to return only the minimum si.price_per_ounce, you can use the MIN() aggregate function. Here's the modified query:

SELECT r.name, (
  SELECT SUM(min_price_per_ounce * ri.ounces) FROM recipe_ingredient ri
  JOIN (
    SELECT si.ingredient_id, MIN(si.price_per_ounce) AS min_price_per_ounce
    FROM store_ingredient si
    GROUP BY si.ingredient_id
  ) AS min_si_prices
  ON ri.ingredient_id = min_si_prices.ingredient_id
  WHERE ri.recipe_id = r.id
) AS "lowest cost"
FROM recipe r;

In this query, we first find the minimum price per ounce for each ingredient by creating a subquery that selects the ingredient_id and the minimum price_per_ounce and groups by ingredient_id. Then, we join this subquery with the recipe_ingredient table and use the minimum price per ounce to calculate the total cost of each recipe.

This is a great example of a one-shot. Typically, GPT does far better when you provide it with code samples or examples to work from.

A.I. is the new IDE

The analogy that has been at the front of my mind lately is that A.I. is the new IDE. Many years ago, you wrote code in simple text editors: no syntax highlighting, no IntelliSense, no import checking. Now we use tools like VSCode and PyCharm to handle those minutiae for us so we can get to the actual code we need to build.

A.I. now enables developers to focus more on the product they want to build, not the DevOps scripts needed to get the project running or some necessary but humdrum code.

In addition, these assistants will embolden developers to build more things because the time-to-market will be reduced. How many developers have thought: "I could build X, but with my current bandwidth, it'd take me about 6 months. That's too long"? Now that time frame might be reduced to 6 weeks or even shorter.

As developers, we must embrace A.I. assistants, particularly for the problems they are skilled at solving. If we don't, then the developers who do will be 10x faster and leave us behind.


Sebastian Wildwood

April 23, 2023