Using LLMs for programming

LLMs, including ChatGPT, have transformed the programming landscape, serving as powerful accelerators for developer productivity. They are useful well beyond homework problems and toy exercises: applied to everyday development tasks, they can significantly enhance efficiency.

This article will detail practical applications of LLMs and ChatGPT in our professional workflows, demonstrating their impact on productivity.

LLM Chat systems

There are multiple large language model (LLM) chat systems available on the market right now:

  • ChatGPT - the best-known system, developed by OpenAI. Performs really well with GPT-4 models

  • Gemini - chat from Google

  • GroqChat - a really fast chat LLM

  • Claude - chat by Anthropic

  • HuggingChat - open-source LLM models

Uses for software development

There are a lot of interesting use cases for LLMs in programming, since code-related tasks are something these models handle particularly well.

Explain how to use certain libraries

Show me how can I estimate Click-Through Rate (CTR) from data using PyMC. Be concise. Give an example.

Output:

import pymc as pm

# Step 1: Model Definition
with pm.Model() as model:
    # Prior for CTR (p)
    p = pm.Beta('p', alpha=1, beta=1)  # Uniform prior, you can adjust alpha and beta based on prior knowledge

    # Step 2: Data Observation
    observations = pm.Binomial('obs', n=1000, p=p, observed=300)  # 1000 trials, 300 successes (clicks)

    # Step 3: Inference
    trace = pm.sample(1000, return_inferencedata=False)  # Adjust the number of samples as needed

# Step 4: Posterior Analysis
pm.plot_posterior(trace)
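If you want numbers rather than a plot, the posterior can also be summarised with ArviZ. A minimal, self-contained sketch of the same model, using PyMC's default InferenceData return value (the 300 clicks out of 1,000 impressions are the same assumed counts as in the output above):

import arviz as az
import pymc as pm

with pm.Model():
    p = pm.Beta('p', alpha=1, beta=1)
    pm.Binomial('obs', n=1000, p=p, observed=300)
    trace = pm.sample(1000)  # returns InferenceData by default in recent PyMC releases

# Posterior mean and 94% highest-density interval for the CTR
print(az.summary(trace, var_names=['p'], kind='stats'))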

Follow-up questions can be powerful too:

I don’t understand why it’s worth using PyMC. Give me 5 sentence summary why should I be using it here. Structure answer clearly and use bullet points if needed.

Gives me:

Using PyMC for estimating Click-Through Rate (CTR) offers several advantages:

  • Uncertainty Quantification: PyMC provides a full posterior distribution of the CTR, allowing for a comprehensive understanding of uncertainty and variability in the estimate, rather than just a single point estimate.

  • Incorporation of Prior Knowledge: With PyMC, you can easily incorporate prior knowledge or beliefs about the CTR through the choice of prior distributions, enhancing the model's relevance and accuracy.

  • Flexibility and Extensibility: PyMC supports complex hierarchical models and dependencies between variables, making it highly adaptable to more intricate scenarios beyond simple CTR estimation.

  • Robust Inference Methods: It uses advanced Markov Chain Monte Carlo (MCMC) and variational inference algorithms to efficiently sample from the posterior distribution, ensuring robust and reliable estimates.

  • Intuitive Model Specification: PyMC's syntax is close to the statistical notation, making the model specification more intuitive and easier to communicate with stakeholders or for educational purposes.
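The prior-knowledge point is easy to act on in code. A minimal sketch of an informative prior, assuming, purely for illustration, that earlier campaigns suggest a CTR of around 30%:

import pymc as pm

with pm.Model() as model:
    # Beta(30, 70) has mean 0.30, encoding the assumed belief from earlier campaigns;
    # the pseudo-count total (30 + 70 = 100) controls how strongly the prior pulls the estimate
    p = pm.Beta('p', alpha=30, beta=70)
    pm.Binomial('obs', n=1000, p=p, observed=300)
    trace = pm.sample(1000)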

Port code between different programming languages

Write me Databricks SQL version of this code:

def foobar(x):
    if x % 2 == 0:
        return "foo"
    else:
        return "bar"

Which results in:

SELECT
  CASE
    WHEN x % 2 = 0 THEN 'foo'
    ELSE 'bar'
  END AS result
FROM
  your_table_name

A similar prompt, this time with one extra instruction:

Write me Databricks SQL version of this code:

def foobar(x):
    if x % 2 == 0:
        return "foo"
    else:
        return "bar"

use SQL create function

Gives you the following:

CREATE OR REPLACE FUNCTION foobar(x INT)
RETURNS STRING
AS
BEGIN
  RETURN CASE
    WHEN x % 2 = 0 THEN 'foo'
    ELSE 'bar'
  END;
END;
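Once the function is registered, it can be called like any built-in SQL function. A quick sanity check from a Databricks notebook might look like this (a sketch assuming the CREATE FUNCTION statement above has been executed and the notebook's predefined spark session is available):

# foobar(2) should return 'foo', foobar(3) should return 'bar'
spark.sql("SELECT foobar(2) AS even_case, foobar(3) AS odd_case").show()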

Code reviews

Suggest me how to improve this code. Be critical but fair. Encourage the use of the best practices of the relevant languages and frameworks. Focus more on higher level feedback like the design of the code instead of tactical things like placement of commas.

This can give you some insightful code comments, especially if you are working with a programming language or environment that you are not familiar with.

Another example:

Review this Python code for me

numbers = [1, 2, 3, 4]

for i, j in zip(range(len(numbers)), numbers):
    print (i, j)

What's wrong? How it could be improved?

Output:

[screenshot of the model's review]
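Whatever the exact wording of the review, the standard improvement for this pattern is to iterate with enumerate instead of zipping a range over the list's length. A minimal sketch of the improved version:

numbers = [1, 2, 3, 4]

# enumerate yields (index, value) pairs directly, with no need for range(len(...))
for i, j in enumerate(numbers):
    print(i, j)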

Explain how code works

It’s useful when you are dealing with an unfamiliar code base:

Here is the code:

[CODE]

Explain what does it do. Give an overview first and then explain step-by-step.

Let’s take a look at the code of model.py from karpathy/nanoGPT, which results in:

[screenshot of the model's explanation]

Gemini will give you a different take:

[screenshot of Gemini's explanation]

Help with errors

The code below

[CODE]

produces an error

[ERROR]

How do I fix that?
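As a made-up illustration of what the [CODE] and [ERROR] placeholders might contain:

data = {"clicks": 300, "impressions": 1000}
ctr = data["click"] / data["impressions"]  # typo in the key name
# running this raises:
# KeyError: 'click'

Pasting both the snippet and the full traceback usually gives the model enough context to spot the misspelled key.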

Code autocompletion

LLM-based autocompletion tools are also great when writing code in an IDE. You can get quick completions for the most mundane tasks, including:

1) Defining new classes

2) Filling in correct parameter values for function calls

3) Controlling predictions by guiding them with comments (see the sketch below)
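A minimal sketch of the comment-guided pattern; the function body below is the kind of completion a tool might propose from the comment, not output from any particular product, and the weekday_name helper is made up for the example:

from datetime import datetime

# Parse an ISO-8601 date string and return the weekday name
def weekday_name(date_str: str) -> str:
    # a completion tool will typically suggest a body like this based on the comment above
    return datetime.fromisoformat(date_str).strftime("%A")

print(weekday_name("2024-03-01"))  # Friday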

 

Get familiar with Prompt Engineering

A lot of the power of LLMs comes from prompt engineering, which can have an immense impact on your results. It’s important to get familiar with prompt engineering best practices and techniques.

A few basic things that we like to use:

  • Ask the system to print output in a specific format. Ask it to be less verbose:

    • “Be concise, do not repeat the task or instructions. Make the answer shorter than 60 words”

  • Define a style:

    • “Do not give a generic answer. Imagine that you are a technical expert in the field who doesn’t like talking much. Answer from that person's perspective.”

  • Ensure that context is sufficient. Give as many examples as possible.

There are a few good resources available online for learning more prompt engineering tricks.

Summary

LLMs can be a major boost to developer productivity: they can greatly help with learning new things, explaining existing code, and writing examples and proof-of-concept code.
