Leverage A.I. to build faster by understanding its strengths
I've been asked recently by a few friends to explain the process of developing with an A.I. assistant, since they've seen mixed reviews of this new technology from other programmers. So I thought I would put together a collection of real-world examples and high-level strategies based on what I've learned over the last few months.
A.I. is not yet great at one-shotting large projects or concepts, though this may change quickly. What this means is that if you prompt GPT with something like:
"Build me a react app that does x"
or even:
"Build me a react app that does X. It should do Y and Z. It should have login and logout.." and so on...
...it will often fail in these scenarios because its ability to auto-complete (remember, LLMs are just really advanced completion engines) gets weaker as the complexity grows. I will counter this by saying GPT is quite good at nearly one-shotting Python scripts, but you should never assume anything works perfectly out of the gate.
Your strategy should be getting it to build your Lego pieces. Lots of small, pure functions are where this technology shines. This makes sense if you think about it: small, focused, pure functions are easy to reason through, even for A.I.! You can achieve this by writing your prompts to walk the A.I. through your code in a more deliberate and explicit way.
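As a purely illustrative sketch (this prompt and function aren't from a real session), instead of asking for the whole app you might ask for one piece at a time: "Write a pure Python function that takes the text of an .srt file and a search string and returns the indices of the lines that contain it, case-insensitively." The kind of result you can expect looks something like this:

def find_matching_lines(srt_text: str, search_string: str) -> list[int]:
    # Return the 0-based indices of lines that contain the search string,
    # compared case-insensitively. Small, pure, and easy to verify at a glance.
    needle = search_string.lower()
    return [i for i, line in enumerate(srt_text.splitlines()) if needle in line.lower()]

Functions at this size are trivial to review and test, which is exactly the point.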
I wanted an application to search a bunch of transcribed audio files that have SRTs and, if the search query is found, open the audio file at the timestamp of the selected result. I couldn't find anything online that did this, so I thought I'd see if GPT could help me write it up quickly.
I can technically code almost anything I think up, but the effort-to-reward ratio dissuades me most of the time. For instance, the time it would take to write this script manually would almost certainly have dissuaded me, since it was just a "nice to have", but with GPT and Copilot I can get it done in minutes, not days. I expect to see a lot more developers building interesting and productive tooling now that A.I. can bring the workload down.
Here is my initial prompt:
Let's write a python application together. It is a CLI application that responds to user input in the terminal. This should be written assuming python 3.10 or higher. Here is the required functionality:
- User submits a string to search. Given a target folder, the application will recursively search all subfolders for any .srt files that contain the search string.
- Application will list any moments where search string is found in audio files where the name matches the .srt file
- User can choose a result and the application will open VLC player on user's computer and play the audio file at the moment in time associated with the line in the .srt file
GPT-4's initial code was a solid first pass:
import argparse
import os
import re
import subprocess
from pathlib import Path

def search_srt_files(target_folder, search_string):
    results = []
    for root, _, files in os.walk(target_folder):
        for file in files:
            if file.endswith('.srt'):
                srt_path = Path(root) / file
                with open(srt_path, encoding='utf-8') as srt_file:
                    content = srt_file.read()
                    if search_string.lower() in content.lower():
                        results.append(srt_path)
    return results

def extract_timestamps(srt_path, search_string):
    timestamps = []
    with open(srt_path, encoding='utf-8') as srt_file:
        for line in srt_file:
            if search_string.lower() in line.lower():
                prev_line = srt_file.tell() - len(line)
                srt_file.seek(prev_line)
                timestamp_line = srt_file.readline()
                start_time = re.match(r'\d{2}:\d{2}:\d{2},\d{3}', timestamp_line)
                if start_time:
                    timestamps.append(start_time.group())
    return timestamps

def open_vlc(audio_path, timestamp):
    vlc_path = "C:\\Program Files\\VideoLAN\\VLC\\vlc.exe"  # Update this path to the location of VLC on your computer
    hh, mm, ss_ms = timestamp.split(':')
    ss, ms = ss_ms.split(',')
    seconds = int(hh) * 3600 + int(mm) * 60 + int(ss) + int(ms) / 1000
    subprocess.Popen([vlc_path, audio_path, f"--start-time={seconds}"])

def main():
    parser = argparse.ArgumentParser(description="Search for a string in .srt files and play the associated audio file.")
    parser.add_argument("search_string", help="String to search in .srt files.")
    parser.add_argument("target_folder", help="Target folder to search for .srt files.")
    args = parser.parse_args()

    search_string = args.search_string
    target_folder = args.target_folder

    srt_files = search_srt_files(target_folder, search_string)
    if not srt_files:
        print("No .srt files found with the search string.")
        return

    for index, srt_path in enumerate(srt_files, start=1):
        print(f"{index}. {srt_path}")
        timestamps = extract_timestamps(srt_path, search_string)
        for timestamp in timestamps:
            print(f" - {timestamp}")

    choice = int(input("\nEnter the index of the file you want to play: ")) - 1
    srt_path = srt_files[choice]
    audio_path = str(srt_path.with_suffix('.mp3'))  # Change this to the appropriate audio format (e.g., .mp3, .wav, .m4a, etc.)
    timestamp = extract_timestamps(srt_path, search_string)[0]  # Assumes the user wants to play the first occurrence
    open_vlc(audio_path, timestamp)

if __name__ == "__main__":
    main()
Great start, but there are a few issues: open_vlc assumes a Windows setup, and that will fail on my Mac. Now that we have a first pass, we can increase the precision of our requests to surgically fix the script and adjust it to our needs.
Rewrite the open_vlc function to target macos. Respond only with the code for that function.
def open_vlc(audio_path, timestamp):
    vlc_path = "/Applications/VLC.app/Contents/MacOS/VLC"  # Path to the VLC executable on macOS
    hh, mm, ss_ms = timestamp.split(':')
    ss, ms = ss_ms.split(',')
    seconds = int(hh) * 3600 + int(mm) * 60 + int(ss) + int(ms) / 1000
    subprocess.Popen([vlc_path, audio_path, f"--start-time={seconds}"])
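As a quick sanity check on the timestamp math in that function (the value here is just an illustration), "00:03:17,500" should come out to 3 * 60 + 17 + 0.5 = 197.5 seconds:

# Illustrative check of the HH:MM:SS,mmm -> seconds conversion used in open_vlc
hh, mm, ss_ms = "00:03:17,500".split(':')
ss, ms = ss_ms.split(',')
seconds = int(hh) * 3600 + int(mm) * 60 + int(ss) + int(ms) / 1000
assert seconds == 197.5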
When I tried to run the script, an error occurred:
"1. /Users/johndoe/workspace/recordings/20230414 1300 Recording/20230414 1300 Recording.srt Traceback (most recent call last): File "/Users/johndoe/workspace/audiosearcher/main.py", line 73, in
main() File "/Users/johndoe/workspace/audio searcher/main.py", line 60, in main timestamps = extracttimestamps(srtpath, searchstring) File "/Users/johndoe/workspace/audiosearcher/main.py", line 26, in extracttimestamps prevline = srt_file.tell() - len(line) OSError: telling position disabled by next() call "Explain it
My apologies for the oversight. The error you encountered is because using the next() function or iterator inside a loop disables the tell() method, which is used to get the current position of the file pointer. Instead of using tell() and seek() to move the file pointer, we can change the approach to extract timestamps from the .srt file.
Here's an updated version of the extract_timestamps function to resolve this issue:
def extract_timestamps(srt_path, search_string):
    timestamps = []
    with open(srt_path, encoding='utf-8') as srt_file:
        lines = srt_file.readlines()
        for i, line in enumerate(lines):
            if search_string.lower() in line.lower():
                j = i - 1
                while j >= 0:
                    timestamp_line = lines[j]
                    start_time = re.match(r'\d{2}:\d{2}:\d{2},\d{3}', timestamp_line)
                    if start_time:
                        timestamps.append(start_time.group())
                        break
                    j -= 1
    return timestamps
This version reads all the lines in the file first and then iterates through them, eliminating the need for tell() and seek(). If the search string is found, it looks for the closest preceding timestamp in the .srt file.
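For context, the failure GPT describes is easy to reproduce on its own: in Python 3, calling tell() on a text file while iterating it line by line raises exactly this OSError (the file name below is just a placeholder):

with open("example.srt", encoding="utf-8") as f:  # any text file will do
    for line in f:
        # Raises OSError: telling position disabled by next() call
        print(f.tell())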
You can see how GPT is able to reason and fix things as you narrow the focus of its task. Its initial pass on the whole script led to simple errors, but as you drill down into the functions and highlight issues, it's able to fix them with ease.
Later on, GPT modified a code block and it led to an error. I leveraged Chain of Thought to force it to walk through the process and fix it:
The problem is not resolved. This is the code that is problematic:
for file_index, srt_path in enumerate(srt_files, start=1):
    print(f"{file_index}. {srt_path}")
    all_timestamps = []
    for search_string in search_strings:
        timestamps = extract_timestamps(srt_path, search_string)
        all_timestamps.extend([(timestamp, search_string) for timestamp in timestamps])
    found_timestamps.append(all_timestamps)
    all_timestamps.sort()
    for timestamp_index, (timestamp, search_string) in enumerate(all_timestamps, start=1):
        occurrence_count = 0
        subtitle_line = ""
        with open(srt_path, encoding='utf-8') as srt_file:
            for line in srt_file:
                if search_string.lower() in line.lower():
                    occurrence_count += 1
                    if str(timestamp) in line:
                        subtitle_line = line.strip()
                        break
        truncated_subtitle = subtitle_line[:50] + "..." if len(subtitle_line) > 50 else subtitle_line
        print(f" {file_index}.{timestamp_index} - {timestamp} - {truncated_subtitle}")
Let's walk through this step by step. We need this code to:
Iterate over every SRT file that was found
Within every found SRT file, iterate over every found timestamp
Print out the line with a file/timestamp index, the timestamp and a truncated subtitle line
Apologies for the confusion earlier. Let's update the problematic code to meet your requirements:
1. Iterate over every SRT file that was found.
2. Within every found SRT file, iterate over every found timestamp.
3. Print out the line with a file/timestamp index, the timestamp, and a truncated subtitle line.
Here's the corrected code:
def main():
    # ... (previous code)

    found_timestamps = []
    for file_index, srt_path in enumerate(srt_files, start=1):
        print(f"{file_index}. {srt_path}")
        all_timestamps = []
        for search_string in search_strings:
            timestamps = extract_timestamps(srt_path, search_string)
            all_timestamps.extend([(timestamp, search_string) for timestamp in timestamps])
        found_timestamps.append(all_timestamps)
        all_timestamps.sort()

        for timestamp_index, (timestamp, search_string) in enumerate(all_timestamps, start=1):
            subtitle_line = ""
            with open(srt_path, encoding='utf-8') as srt_file:
                lines = srt_file.readlines()
                for i, line in enumerate(lines):
                    if str(timestamp) in line and search_string.lower() in lines[i+1].lower():
                        subtitle_line = lines[i+1].strip()
                        break
            truncated_subtitle = subtitle_line[:50] + "..." if len(subtitle_line) > 50 else subtitle_line
            print(f" {file_index}.{timestamp_index} - {timestamp} - {truncated_subtitle}")

    # ... (rest of the code)
This updated code should correctly display the file/timestamp index, timestamp, and truncated subtitle for each found result.
Working through the errors took about 10 minutes, at which point the script worked perfectly. Overall time spent was about 20 minutes with distractions. If I had attempted to code this myself, I estimate it would have taken me about one full day of focused coding and debugging.
Recently I needed a Postgres query to return the minimum value from a subquery. My SQL-fu is sub-par, and historically I would have Googled similar examples and adjusted my code to match, but with GPT I can just provide my code and have it altered in place.
Given this Postgresql query:
select r.name, (
    select sum(si.price_per_ounce * ri.ounces)
    from store_ingredient si
    join recipe_ingredient ri on ri.ingredient_id = si.ingredient_id
    where si.ingredient_id in (
        select ri.ingredient_id
        from recipe_ingredient ri
        where ri.recipe_id = r.id
    )
) as "lowest cost"
from recipe r;
How would I change this query so that only the minimum si.price_per_ounce is returned?
To modify the given PostgreSQL query to return only the minimum si.price_per_ounce, you can use the MIN() aggregate function. Here's the modified query:
SELECT r.name, (
    SELECT SUM(min_price_per_ounce * ri.ounces)
    FROM recipe_ingredient ri
    JOIN (
        SELECT si.ingredient_id, MIN(si.price_per_ounce) AS min_price_per_ounce
        FROM store_ingredient si
        GROUP BY si.ingredient_id
    ) AS min_si_prices
        ON ri.ingredient_id = min_si_prices.ingredient_id
    WHERE ri.recipe_id = r.id
) AS "lowest cost"
FROM recipe r;
In this query, we first find the minimum price per ounce for each ingredient by creating a subquery that selects the ingredient_id and the minimum price_per_ounce and groups by ingredient_id. Then, we join this subquery with the recipe_ingredient table and use the minimum price per ounce to calculate the total cost of each recipe.
This is a great example of a one-shot. Typically, GPT does far better when you provide it with code samples or examples to work from.
The analogy that has been at the front of my mind lately is that A.I. is now the new IDE. Many years ago, you wrote code in simple text editors: no syntax highlighting, no IntelliSense, no import checking. Now we use tools like VSCode and PyCharm to handle a lot of that minutiae for us so we can get to the actual code we need to build.
A.I. now enables developers to focus more on the product they want to build, not the devops scripts needed to get the project running or some necessary but humdrum code.
In addition, these assistants will embolden developers to build more things because the time-to-market will be reduced. How many developers have thought: "I could build X, but with my current bandwidth, it'd take me about 6 months. That's too long". Now that time-frame might be reduced to 6 weeks or even shorter.
As developers, we must embrace A.I. assistants, particularly for the problems they are skilled at solving. If we don't, then the developers who do will be 10x faster and leave us behind.
Sebastian Wildwood
April 23, 2023