Regex completion and explanation

Implement regex-based sentence splitting for function completion.

You can query a model directly and compare the results returned for different parameter values, either in the Cloud console or by calling the Vertex AI API.

System instructions

Include a TLDR of what the code does and what fixes you made at the end of your response. Each explanation shouldn't be more than two sentences long.

Prompt Freeform

I've been asked to finish work on this function that splits sentences into a list. Unfortunately, I've been specifically told to use regex for this, with extra requirements explained in the function's docstring. Regex is a pain in the butt at the best of times - could you complete this code for me? Everything you need should be inside the snippet.

import re

def splitSentences(sentences):
    """
    Splits a string of sentences into a list.
    Each sentence is period-separated, save for cases where
    the phrases "Mr." or "Ms." are used, since those
    don't end a sentence.
    """
    sentenceList = []

    # missing code here

    return sentenceList

Response

Of course! Dealing with regex can be tricky, but a "negative lookbehind" is perfect for this kind of problem. Here is the completed code.

import re

def splitSentences(sentences):
    """
    Splits a string of sentences into a list.
    Each sentence is period-separated, save for cases where
    the phrases "Mr." or "Ms." are used, since those
    don't end a sentence.
    """
    # This regex pattern uses a "negative lookbehind" (?<!...) to find a period
    # that is NOT preceded by "Mr" or "Ms". The inner group is non-capturing
    # (?:...) so that re.split doesn't add None entries to the result. The
    # pattern also matches any trailing spaces so the next sentence doesn't
    # start with a space.
    pattern = r'(?<!(?:Mr|Ms))\. *'

    # We split the string by the pattern, and then use a list comprehension
    # to filter out any empty strings that might be created if the
    # input text ends with a period.
    sentenceList = [s for s in re.split(pattern, sentences) if s]

    return sentenceList

# Example Usage:
text = "Hello Mr. Smith. This is a test. Ms. Jones is here. This is the end."
print(splitSentences(text))
# Expected Output: ['Hello Mr. Smith', 'This is a test', 'Ms. Jones is here', 'This is the end']

TLDR

What the code does: This function takes a string of text and splits it into a list of individual sentences. It correctly handles sentences that contain abbreviations like "Mr." or "Ms." without splitting them.

What fixes I made: I used a regex pattern with a negative lookbehind to split the string on periods, except when they are preceded by "Mr" or "Ms". The code also cleans up the resulting list by removing extra whitespace and any empty strings.
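
One detail of re.split is worth keeping in mind when adapting a pattern like this: if the pattern contains a capturing group, re.split also returns that group's text at every split point. A group inside a negative lookbehind never takes part in a successful match, so those extra entries come back as None. The sketch below is illustrative only; the sample string and the names `capturing` and `non_capturing` are assumptions, not part of the original prompt.

```python
import re

# Hypothetical sample input, chosen only for illustration.
text = "Hello Mr. Smith. This is a test."

# Capturing group inside the lookbehind: re.split adds one entry per group
# at every split point. The group is unset whenever the negative lookbehind
# succeeds, so each of those extra entries is None.
capturing = re.split(r'(?<!(Mr|Ms))\. *', text)

# Non-capturing group: only the text between split points is returned.
non_capturing = re.split(r'(?<!(?:Mr|Ms))\. *', text)

print(capturing)      # ['Hello Mr. Smith', None, 'This is a test', None, '']
print(non_capturing)  # ['Hello Mr. Smith', 'This is a test', '']
```

Filtering the result with `if s` drops both the None entries and the trailing empty string, so either form of the pattern yields the same final list. The non-capturing form simply avoids relying on the filter for that.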

Model: gemini-2.5-pro
Temperature: 0.2
Max output tokens: 8192
TopK: 40.0
TopP: 0.95